Fundamentals of Multimedia


Texts in Computer Science

Series Editors

David Gries, Department of Computer Science, Cornell University, Ithaca, NY, USA

Orit Hazzan, Faculty of Education in Technology and Science, Technion–Israel Institute of Technology, Haifa, Israel

More information about this series at http://www.springer.com/series/3191

Ze-Nian Li • Mark S. Drew • Jiangchuan Liu

Fundamentals of Multimedia

Third Edition


Ze-Nian Li
School of Computing Science
Simon Fraser University
Burnaby, BC, Canada

Mark S. Drew
School of Computing Science
Simon Fraser University
Burnaby, BC, Canada

Jiangchuan Liu
School of Computing Science
Simon Fraser University
Burnaby, BC, Canada

ISSN 1868-0941
ISSN 1868-095X (electronic)
Texts in Computer Science
ISBN 978-3-030-62123-0
ISBN 978-3-030-62124-7 (eBook)
https://doi.org/10.1007/978-3-030-62124-7

1st edition: © Prentice-Hall, Inc. 2004
2nd edition: © Springer International Publishing Switzerland 2014
3rd edition: © Springer Nature Switzerland AG 2021

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

To my mom, and my wife Yansin.

Ze-Nian

To Noah, Ira, Eva and, especially, to Jenna.

Mark

To my wife Jill, and my children Kevin, Jerry, and Kathy.

Jiangchuan

Preface

In the 17 years since the first edition of Fundamentals of Multimedia, the field and applications of multimedia have flourished and are undergoing ever more rapid growth and evolution in various emerging interdisciplinary areas. However, a comprehensive textbook to aid the continuous learning and mastering of the fundamental concepts and knowledge in multimedia remains essential.

While the original edition was published by Prentice-Hall, starting from the second edition we have chosen Springer, a prestigious publisher that has a superb and rapidly expanding array of computer science textbooks, particularly the high-quality, dedicated, and established textbook series Texts in Computer Science, of which this textbook forms a part. The second edition included considerable added depth to the networking aspect of the book. To this end, Dr. Jiangchuan Liu was added to the team of authors.

This third edition again constitutes a significant revision: the textbook has been thoroughly revised and updated to include recent developments in the field. For example, we updated the introduction to some of the current multimedia tools, and we included current topics such as 360° video and the video coding standard H.266; new-generation social, mobile, and cloud computing for human-centric interactive multimedia, augmented reality and virtual reality; deep learning for multimedia processing; and their attendant technologies.

Multimedia is associated with a rich set of core subjects in Computer Science and Engineering, and we address those here. The book is not an introduction to simple design considerations and tools; it serves a more advanced audience than that. On the other hand, the book is not a reference work; it is more a traditional textbook. While we perforce may discuss multimedia tools, we would like to give a sense of the underlying issues at play in the tasks those tools carry out. Students who undertake and succeed in a course based on this text can be said to really understand fundamental matters in regard to this material, hence the title of the text.

In conjunction with this text, a full-fledged course should also allow students to make use of this knowledge to carry out interesting or even wonderful practical projects in multimedia; interactive projects that engage and sometimes amuse; and, perhaps, even teach these same concepts.


Who Should Read This Book?

This text aims at introducing the basic ideas used in multimedia, for an audience that is comfortable with technical applications, e.g., Computer Science students and Engineering students. The book aims to cover an upper-level undergraduate multimedia course, but could also be used in more advanced courses. Indeed, a (quite long) list of courses making use of the first two editions of this text includes many undergraduate courses as well as use as a pertinent point of departure for graduate students who may not have encountered these ideas before in a practical way. As well, the book would be a good reference for anyone, including those in industry, who is interested in current multimedia technologies. The selection of material in the text addresses real issues that these learners will be facing as soon as they show up in the workplace. Some topics are simple, but new to the students; some are somewhat complex, but unavoidably so in this emerging area.

The text mainly presents concepts, not applications. A multimedia course, on the other hand, teaches these concepts, and tests them, but also allows students to utilize skills they already know, in coding and presentation, to address problems in multimedia. The accompanying website materials for the text include some code for multimedia applications along with some projects students have developed in such a course, plus other useful materials best presented in electronic form.

Have the Authors Used This Material in a Real Class?

Since 1996, we have taught a third-year undergraduate course in Multimedia Systems based on the introductory materials set out in this book. A one-semester course very likely could not include all the material covered in this text, but we have usually managed to consider a good many of the topics addressed, with mention made of a selected number of issues in Parts 3 and 4, within that time frame.

As well, over the same time period and again as a one-semester course, we have also taught a graduate-level course using notes covering topics similar to the ground covered by this text, as an introduction to more advanced materials. A fourth-year or graduate-level course would do well to discuss material from the first three parts of the book and then consider some material from the last part, perhaps in conjunction with some of the original research references included here along with results presented at topical conferences.

We have attempted to fill both needs, concentrating on an undergraduate audience but including more advanced material as well. Sections that can safely be omitted on a first reading are marked with an asterisk in the Table of Contents.


What is Covered in This Text?

In Part 1, Introduction and Multimedia Data Representations, we introduce some of the notions included in the term Multimedia, and look at its present as well as its history. Practically speaking, we carry out multimedia projects using software tools, so in addition to an overview of multimedia software tools, we get down to some of the nuts and bolts of multimedia authoring. The representation of data is critical in the study of multimedia, and we look at the most important data representations for use in multimedia applications. Specifically, graphics and image data, video data, and audio data are examined in detail. Since color is vitally important in multimedia programs, we see how this important area impacts multimedia issues.

In Part 2, Multimedia Data Compression, we consider how we can make all this data fly onto the screen and speakers. Multimedia data compression turns out to be a very important enabling technology that makes modern multimedia systems possible. Therefore, we look at lossless and lossy compression methods, supplying the fundamental concepts necessary to fully understand these methods. For the latter category, lossy compression, arguably JPEG still-image compression standards, including JPEG 2000, are the most important, so we consider these in detail. But since a picture is worth 1,000 words, and video is therefore worth more than a million words per minute, we examine the ideas behind the MPEG standards MPEG-1, MPEG-2, MPEG-4, MPEG-7, and beyond into modern video coding standards H.264, H.265, and H.266. Audio compression is treated separately, and we consider some basic audio and speech compression techniques and take a look at MPEG Audio, including MP3 and AAC.

In Part 3, Multimedia Communications and Networking, we consider the great demands multimedia communication and content sharing place on networks and systems. The Internet, however, was not initially designed for multimedia content distribution and there are significant challenges to be addressed. We discuss the wired Internet and wireless mobile network technologies and protocols, and the enhancements to them that make multimedia communications possible. We further examine state-of-the-art multimedia content distribution mechanisms, as well as modern cloud computing for highly scalable multimedia data processing. The discussion also includes the latest edge computing and serverless computing solutions toward fine-grained and flexible real-time multimedia.

In Part 4, Human-Centric Interactive Multimedia, we examine a number of technologies that form the heart of enabling the new Web 2.0 paradigm, with rich user interactions. Such popular Web 2.0-based social media sharing websites as YouTube, Facebook, Twitter, Twitch, and TikTok have drastically changed the content generation and distribution landscape, and indeed have become an integral part of people’s daily lives. The developments in the coding algorithms and hardware for sensing, communication, and interaction also empower virtual reality (VR) and augmented reality (AR), providing better immersive experiences beyond 3D. This part examines these new-generation interactive multimedia services and discusses their potential and challenges. The huge amount of multimedia content also militates for multimedia-aware search mechanisms, and we therefore consider the challenges and mechanisms for multimedia content search and retrieval.

Textbook Website

The book website is http://www.cs.sfu.ca/mmbook. There the reader will find general information about the book including previous editions, an errata sheet updated regularly, programs that help demonstrate concepts in the text, and a dynamic set of links for the “Further Exploration” section in some of the chapters. Since these links are regularly updated, and of course URLs change quite often, the links are online rather than within the printed text.

Instructors’ Resources

The main text website has no ID and password, but access to sample student projects is at the instructor’s discretion and is password-protected. For instructors, with a different password, the website also contains Course Instructor resources for adopters of the text. These include an extensive collection of online slides, solutions for the exercises in the text, sample assignments and solutions, sample exams, and extra exam questions.

Acknowledgments

We are most grateful to colleagues who generously gave of their time to review this text, and we wish to express our thanks to Edward Chang, Shu-Ching Chen, Qianping Gu, Mohamed Hefeeda, Rachelle S. Heller, Gongzhu Hu, S. N. Jayaram, Tiko Kameda, Joonwhoan Lee, Xiaobo Li, Jie Liang, Siwei Lu, Jiebo Luo, and Jacques Vaisey.

The writing of this text has been greatly aided by a number of suggestions and contributions from present and former colleagues and students. We would like to thank Mohamed Athiq, James Au, Yi Ching David Chou, Chad Ciavarro, Hossein Hajimirsadeghi, Hao Jiang, Mehran Khodabandeh, Steven Kilthau, Michael King, Tian Lan, Chenyu Li, Haitao Li, Cheng Lu, Minlong Lu, You Luo, Xiaoqiang Ma, Hamidreza Mirzaei, Peng Peng, Haoyu Ren, Ryan Shea, Chantal Snazel, Wenqi Song, Yi Sun, Dominic Szopa, Zinovi Tauber, Malte von Ruden, Fangxin Wang, Jian Wang, Jie Wei, Edward Yan, Osmar Zaïane, Cong Zhang, Lei Zhang, Miao Zhang, Wenbiao Zhang, Yuan Zhao, Ziyang Zhao, William Zhong, Qiang Zhu, and Yifei Zhu for their assistance. Yi Ching David Chou also helped with refreshing the companion website for the textbook. As well, Dr. Ye Lu made great contributions to Chaps. 8 and 9; Andy Sun contributed Chap. 20. Their valiant efforts are particularly appreciated. We are also most grateful to the students who generously made their course projects available for instructional use for this book.

Burnaby, Canada

Ze-Nian Li
Mark S. Drew
Jiangchuan Liu


Contents

Part I Introduction and Multimedia Data Representations

1 Introduction to Multimedia
1.1 What is Multimedia?
1.1.1 Components of Multimedia
1.2 Multimedia: Past and Present
1.2.1 Early History of Multimedia
1.2.2 Hypermedia, WWW, and Internet
1.2.3 Multimedia in the New Millennium
1.3 Multimedia Software Tools: A Quick Scan
1.3.1 Music Sequencing and Notation
1.3.2 Digital Audio
1.3.3 Graphics and Image Editing
1.3.4 Video Editing
1.3.5 Animation
1.3.6 Multimedia Authoring
1.3.7 Multimedia Broadcasting
1.4 The Future of Multimedia
1.5 Exercises
References

2 A Taste of Multimedia
2.1 Multimedia Tasks and Concerns
2.2 Multimedia Presentation
2.3 Data Compression
2.4 Multimedia Production
2.5 Multimedia Sharing and Distribution
2.6 Some Useful Editing and Authoring Tools
2.6.1 Adobe Premiere
2.6.2 HTML5 Canvas
2.6.3 Adobe Director
2.6.4 Adobe XD
2.7 Exercises
References

3 Graphics and Image Data Representations
3.1 Graphics and Image Data Types
3.1.1 1-Bit Images
3.1.2 8-Bit Gray-Level Images
3.1.3 Image Data Types
3.1.4 24-Bit Color Images
3.1.5 Higher Bit-Depth Images
3.1.6 8-Bit Color Images
3.1.7 Color Lookup Tables (LUTs)
3.2 Popular File Formats
3.2.1 GIF
3.2.2 JPEG
3.2.3 PNG
3.2.4 TIFF
3.2.5 Windows BMP
3.2.6 Windows WMF
3.2.7 Netpbm Format
3.2.8 EXIF
3.2.9 HEIF
3.2.10 PS and PDF
3.2.11 PTM
3.3 Exercises
References

4 Color in Image and Video
4.1 Color Science
4.1.1 Light and Spectra
4.1.2 Human Vision
4.1.3 Spectral Sensitivity of the Eye
4.1.4 Image Formation
4.1.5 Camera Systems
4.1.6 Gamma Correction
4.1.7 Color-Matching Functions
4.1.8 CIE Chromaticity Diagram
4.1.9 Color Monitor Specifications
4.1.10 Out-of-Gamut Colors
4.1.11 White Point Correction
4.1.12 XYZ to RGB Transform
4.1.13 Transform with Gamma Correction
4.1.14 L*a*b* (CIELAB) Color Model
4.1.15 More Color Coordinate Schemes
4.1.16 Munsell Color Naming System
4.2 Color Models in Images
4.2.1 RGB Color Model for Displays
4.2.2 Multi-sensor Cameras
4.2.3 Camera-Dependent Color
4.2.4 Subtractive Color: CMY Color Model
4.2.5 Transformation from RGB to CMY
4.2.6 Undercolor Removal: CMYK System
4.2.7 Printer Gamuts
4.2.8 Multi-ink Printers
4.3 Color Models in Video
4.3.1 Video Color Transforms
4.3.2 YUV Color Model
4.3.3 YIQ Color Model
4.3.4 YCbCr Color Model
4.4 Exercises
References

5 Fundamental Concepts in Video
5.1 Analog Video
5.1.1 NTSC Video
5.1.2 PAL Video
5.1.3 SECAM Video
5.2 Digital Video
5.2.1 Chroma Subsampling
5.2.2 CCIR and ITU-R Standards for Digital Video
5.2.3 High Definition TV (HDTV)
5.2.4 Ultra-High-Definition TV (UHDTV)
5.3 Video Display Interfaces
5.3.1 Analog Display Interfaces
5.3.2 Digital Display Interfaces
5.4 360° Video
5.4.1 Equirectangular Projection (ERP)
5.4.2 Other Projections
5.5 3D Video and TV
5.5.1 Cues for 3D Percept
5.5.2 3D Camera Models
5.5.3 3D Movie and TV Based on Stereo Vision
5.5.4 The Vergence–Accommodation Conflict
5.5.5 Autostereoscopic (Glasses-Free) Display Devices
5.5.6 Disparity Manipulation in 3D Content Creation
5.6 Video Quality Assessment (VQA)
5.6.1 Objective Assessment
5.6.2 Subjective Assessment
5.6.3 Other VQA Metrics
5.7 Exercises
References

6 Basics of Digital Audio
6.1 Digitization of Sound
6.1.1 What Is Sound?
6.1.2 Digitization
6.1.3 Nyquist Theorem
6.1.4 Signal-to-Noise Ratio (SNR)
6.1.5 Signal-to-Quantization-Noise Ratio (SQNR)
6.1.6 Linear and Nonlinear Quantization
6.1.7 Audio Filtering
6.1.8 Audio Quality versus Data Rate
6.1.9 Synthetic Sounds
6.2 MIDI: Musical Instrument Digital Interface
6.2.1 MIDI Overview
6.2.2 Hardware Aspects of MIDI
6.2.3 Structure of MIDI Messages
6.2.4 MIDI-to-WAV Conversion
6.2.5 General MIDI
6.2.6 MIDI 2.0
6.3 Quantization and Transmission of Audio
6.3.1 Coding of Audio
6.3.2 Pulse Code Modulation
6.3.3 Differential Coding of Audio
6.3.4 Lossless Predictive Coding
6.3.5 DPCM
6.3.6 DM
6.3.7 ADPCM
6.4 Exercises
References

Part II Multimedia Data Compression

7 Lossless Compression Algorithms
7.1 Introduction
7.2 Basics of Information Theory
7.3 Run-Length Coding
7.4 Variable-Length Coding (VLC)
7.4.1 Shannon–Fano Algorithm
7.4.2 Huffman Coding
7.4.3 Adaptive Huffman Coding
7.5 Dictionary-Based Coding
7.6 Arithmetic Coding
7.6.1 Basic Arithmetic Coding Algorithm
7.6.2 Scaling and Incremental Coding
7.6.3 Integer Implementation
7.6.4 Binary Arithmetic Coding
7.6.5 Adaptive Arithmetic Coding
7.7 Lossless Image Compression
7.7.1 Differential Coding of Images
7.7.2 Lossless JPEG
7.8 Exercises
References

8 Lossy Compression Algorithms
8.1 Introduction
8.2 Distortion Measures
8.3 The Rate-Distortion Theory
8.4 Quantization
8.4.1 Uniform Scalar Quantization
8.4.2 Nonuniform Scalar Quantization
8.4.3 Vector Quantization
8.5 Transform Coding
8.5.1 Discrete Cosine Transform (DCT)
8.5.2 Karhunen–Loève Transform*
8.6 Wavelet-Based Coding
8.6.1 Introduction
8.6.2 Continuous Wavelet Transform*
8.6.3 Discrete Wavelet Transform*
8.7 Wavelet Packets
8.8 Embedded Zerotree of Wavelet Coefficients
8.8.1 The Zerotree Data Structure
8.8.2 Successive Approximation Quantization
8.8.3 EZW Example
8.9 Set Partitioning in Hierarchical Trees (SPIHT)
8.10 Exercises
References


9 Image Compression Standards
9.1 The JPEG Standard
9.1.1 Main Steps in JPEG Image Compression
9.1.2 JPEG Modes
9.1.3 A Glance at the JPEG Bitstream
9.2 The JPEG 2000 Standard
9.2.1 Main Steps of JPEG 2000 Image Compression*
9.2.2 Adapting EBCOT to JPEG 2000
9.2.3 Region-of-Interest Coding
9.2.4 Comparison of JPEG and JPEG 2000 Performance
9.3 The JPEG-LS Standard
9.3.1 Prediction
9.3.2 Context Determination
9.3.3 Residual Coding
9.3.4 Near-Lossless Mode
9.4 Bi-Level Image Compression Standards
9.4.1 The JBIG Standard
9.4.2 The JBIG2 Standard
9.5 Exercises
References

10 Basic Video Compression Techniques
10.1 Introduction to Video Compression
10.2 Video Compression Based on Motion Compensation
10.3 Search for Motion Vectors
10.3.1 Sequential Search
10.3.2 2D Logarithmic Search
10.3.3 Hierarchical Search
10.4 H.261
10.4.1 Intra-Frame (I-Frame) Coding
10.4.2 Inter-Frame (P-Frame) Predictive Coding
10.4.3 Quantization in H.261
10.4.4 H.261 Encoder and Decoder
10.4.5 A Glance at the H.261 Video Bitstream Syntax
10.5 H.263
10.5.1 Motion Compensation in H.263
10.5.2 Optional H.263 Coding Modes
10.5.3 H.263+ and H.263++
10.6 Exercises
References

11 MPEG Video Coding: MPEG-1, 2, 4, and 7
11.1 Overview
11.2 MPEG-1

11.2.1 Motion Compensation in MPEG-1
11.2.2 Other Major Differences from H.261
11.2.3 MPEG-1 Video Bitstream

11.3 MPEG-2
11.3.1 Supporting Interlaced Video
11.3.2 MPEG-2 Scalabilities
11.3.3 Other Major Differences from MPEG-1

11.4 MPEG-4
11.4.1 Overview of MPEG-4
11.4.2 Video Object-Based Coding in MPEG-4
11.4.3 Synthetic Object Coding in MPEG-4
11.4.4 MPEG-4 Parts, Profiles, and Levels

11.5 MPEG-7
11.5.1 Descriptor (D)
11.5.2 Description Scheme (DS)
11.5.3 Description Definition Language (DDL)

11.6 Exercises
References

12 Modern Video Coding Standards: H.264, H.265, and H.266
12.1 Overview
12.2 H.264

12.2.1 Motion Compensation
12.2.2 Integer Transform
12.2.3 Quantization and Scaling
12.2.4 Examples of H.264 Integer Transform and Quantization
12.2.5 Intra-Coding
12.2.6 In-loop Deblocking Filtering
12.2.7 Entropy Coding
12.2.8 Context-Adaptive Variable Length Coding (CAVLC)
12.2.9 Context-Adaptive Binary Arithmetic Coding (CABAC)
12.2.10 H.264 Profiles
12.2.11 H.264 Scalable Video Coding (SVC)
12.2.12 H.264 Multiview Video Coding (MVC)

12.3 H.265
12.3.1 Motion Compensation
12.3.2 Integer Transform

12.3.3 Quantization and Scaling
12.3.4 Intra-Coding
12.3.5 Discrete Sine Transform (DST)
12.3.6 In-Loop Filtering
12.3.7 Entropy Coding
12.3.8 Special Coding Modes
12.3.9 H.265 Profiles
12.3.10 H.265 Extensions

12.4 H.266
12.4.1 Motion Compensation
12.4.2 Adaptive Multiple Transforms
12.4.3 Non-separable Secondary Transform
12.4.4 In-Loop Filtering
12.4.5 Tools for High Dynamic Range (HDR) Video
12.4.6 Tools for 360° Video
12.4.7 H.266 Performance Report

12.5 Exercises
References

13 Basic Audio Compression Techniques
13.1 ADPCM in Speech Coding

13.1.1 ADPCM
13.1.2 G.726 ADPCM, G.727-9

13.2 Vocoders
13.2.1 Phase Insensitivity
13.2.2 Channel Vocoder
13.2.3 Formant Vocoder
13.2.4 Linear Predictive Coding
13.2.5 CELP
13.2.6 Hybrid Excitation Vocoders*

13.3 Open Source Speech Codecs*
13.3.1 Speex
13.3.2 Opus

13.4 Exercises
References

14 MPEG Audio Compression
14.1 Psychoacoustics

14.1.1 Equal-Loudness Relations
14.1.2 Frequency Masking
14.1.3 Temporal Masking

14.2 MPEG Audio
14.2.1 MPEG Layers
14.2.2 MPEG Audio Strategy

14.2.3 MPEG Audio Compression Algorithm
14.2.4 MPEG-2 AAC (Advanced Audio Coding)
14.2.5 MPEG-4 Audio

14.3 Other Audio Codecs
14.3.1 Ogg Vorbis

14.4 MPEG-7 Audio and Beyond
14.5 Further Exploration
14.6 Exercises
References

Part III Multimedia Communications and Networking

15 Network Services and Protocols for Multimedia Communications
15.1 Protocol Layers of Computer Communication Networks
15.2 Local Area Network (LAN) and Access Networks

15.2.1 LAN Standards
15.2.2 Ethernet Technology
15.2.3 Access Network Technologies

15.3 Internet Technologies and Protocols
15.3.1 Network Layer: IP
15.3.2 Transport Layer: TCP and UDP
15.3.3 Network Address Translation (NAT) and Firewall
15.4 Multicast Extension

15.4.1 Router-Based Architectures: IP Multicast
15.4.2 Non Router-Based Multicast Architectures

15.5 Quality of Service (QoS) and Quality of Experience (QoE)
15.5.1 QoS and QoE for Multimedia Communications
15.5.2 Internet QoS Architecture: IntServ and DiffServ
15.5.3 Network Softwarization and Virtualization: SDN and NVF
15.5.4 Rate Control and Buffer Management

15.6 Protocols for Multimedia Transmission and Interaction
15.6.1 HyperText Transfer Protocol (HTTP)
15.6.2 Real-Time Transport Protocol (RTP)
15.6.3 RTP Control Protocol (RTCP)
15.6.4 Real-Time Streaming Protocol (RTSP)

15.7 Case Study: Internet Telephony
15.7.1 Signaling Protocols: H.323 and Session Initiation Protocol (SIP)
15.8 Further Exploration

15.9 Exercises
References

16 Internet Multimedia Content Distribution
16.1 Proxy Caching

16.1.1 Sliding-Interval Caching
16.1.2 Prefix Caching and Segment Caching
16.1.3 Rate-Split Caching and Work-Ahead Smoothing

16.2 Content Distribution Networks (CDNs)
16.2.1 Request Routing and Redirection
16.2.2 Representative: Akamai Streaming CDN

16.3 Broadcast/Multicast Video Distribution
16.3.1 Smart TV and Set-Top Box (STB)
16.3.2 Scalable Broadcast/Multicast VoD
16.3.3 Multi-rate Broadcast/Multicast for Heterogeneous Users
16.4 Application-Layer Multicast and Peer-to-Peer Streaming

16.4.1 Application-Layer Multicast Tree
16.4.2 Representative: End-System Multicast (ESM)
16.4.3 Peer-to-Peer Mesh Overlay
16.4.4 Representative: CoolStreaming

16.5 Web-Based Media Streaming
16.5.1 Dynamic Adaptive Streaming over HTTP (DASH)
16.5.2 Common Media Application Format (CMAF)
16.5.3 Web Real-Time Communication (WebRTC)

16.6 Exercises
References

17 Multimedia Over Wireless and Mobile Networks
17.1 Characteristics of Wireless Channels

17.1.1 Path Loss
17.1.2 Multipath Fading

17.2 Wireless Networking Technologies
17.2.1 Cellular Wireless Mobile Networks: 1G–5G
17.2.2 Wireless Local Area Networks (WLANs)
17.2.3 Bluetooth and Short-Range Technologies

17.3 Multimedia Over Wireless Channels
17.3.1 Error Detection
17.3.2 Error Correction
17.3.3 Error-Resilient Coding
17.3.4 Error Concealment

17.4 Mobility Management
17.4.1 Network Layer Mobile IP
17.4.2 Link-Layer Handoff Management

17.5 Further Exploration
17.6 Exercises
References

18 Cloud Computing for Multimedia Services
18.1 Cloud Computing Overview

18.1.1 Representative Storage Service: Amazon S3
18.1.2 Representative Computation Service: Amazon EC2
18.2 Multimedia Cloud Computing
18.3 Multimedia Content Sharing over Cloud

18.3.1 Impact of Globalization
18.3.2 Case Study: Netflix

18.4 Multimedia Computation Offloading
18.4.1 Requirements for Computation Offloading
18.4.2 Service Partitioning for Video Processing

18.5 Interactive Cloud Gaming
18.5.1 Workload and Delay in Cloud Gaming
18.5.2 Implementation and Deployment

18.6 Edge Computing and Serverless Computing for Multimedia
18.6.1 Mobile Edge Computing
18.6.2 Serverless Computing for Video Processing

18.7 Further Exploration
18.8 Exercises
References

Part IV Human-Centric Interactive Multimedia

19 Online Social Media Sharing
19.1 Representatives of Social Media Services

19.1.1 User-Generated Content (UGC)
19.1.2 Online Social Networking (OSN)

19.2 User-Generated Media Content Sharing
19.2.1 YouTube Video Format and Meta-Data
19.2.2 Characteristics of YouTube Video
19.2.3 Small-World in YouTube Videos
19.2.4 YouTube from a Partner’s View
19.2.5 Crowdsourced Interactive Livecast

19.3 Media Propagation in Online Social Networks

19.3.1 Sharing Patterns of Individual Users
19.3.2 Video Propagation Structure and Model
19.3.3 Video Watching and Sharing Behaviors

19.4 Mobile Video Clip Sharing
19.4.1 Mobile Interface Characteristics
19.4.2 Video Clip Popularity
19.4.3 Lifespan and Propagation

19.5 Further Exploration
19.6 Exercises
References

20 Augmented Reality and Virtual Reality
20.1 Defining Augmented Reality and Virtual Reality
20.2 Workflow of Augmented Reality

20.2.1 Sensory Data Collection
20.2.2 Localization and Alignment
20.2.3 World Generation and Emission

20.3 Early Foundational Systems and Applications
20.4 Enabling Hardware and Infrastructure

20.4.1 Graphics Processing Unit (GPU)
20.4.2 Global Positioning System (GPS)
20.4.3 Networking for Multiple Users

20.5 Modern Augmented Reality Systems and Applications
20.6 Limitations and Challenges

20.6.1 Color Perception
20.6.2 Depth Perception
20.6.3 Localization
20.6.4 Information Presentation
20.6.5 Social Acceptance

20.7 Further Exploration
20.8 Exercises
References

21 Content-Based Retrieval in Digital Libraries
21.1 How Should We Retrieve Images?
21.2 Synopsis of Early CBIR Systems
21.3 C-BIRD—An Early Experiment

21.3.1 Color Histogram
21.3.2 Color Density and Color Layout
21.3.3 Texture Layout
21.3.4 Search by Illumination Invariance
21.3.5 Search-by-Object Model

21.4 Quantifying Search Results
21.5 Key Technologies in Current CBIR Systems

21.5.1 Robust Image Features and Their Representation

21.5.2 User Feedback and Collaboration
21.5.3 Other Post-processing Techniques
21.5.4 Visual Concept Search
21.5.5 Feature Learning with Convolutional Neural Networks
21.5.6 Database Indexing

21.6 Querying on Videos
21.7 Querying on Videos Based on Human Activity—A Case Study
21.7.1 Modeling Human Activity Structures
21.7.2 Experimental Results

21.8 Quality-Aware Mobile Visual Search
21.8.1 Quality-Aware Method
21.8.2 Experimental Results

21.9 Deep Incremental Hashing Network*
21.9.1 Problem Definition
21.9.2 Descriptions of DIHN
21.9.3 Experimental Results

21.10 Exercises
References

Index
