A New Implementation of High Resolution Video Encoding Using the HEVC Standard

Alaa F. Eldeken¹, Mohamed M. Fouad¹, Gouda I. Salama¹, and Aliaa A. Youssif²

¹ Dept. of Computer Engineering, Military Technical College, Cairo, EGYPT
² Faculty of Computer & Information, Helwan University, Helwan, Cairo, EGYPT

{eldeken,mmafoad,drgouda,ayoussif}@ieee.org

Abstract. This paper introduces a new approach to implementing the encoding of high resolution videos with the high efficiency video coding (HEVC) standard. The HEVC standard, the successor to the H.264/AVC standard, encodes high resolution videos more efficiently than H.264/AVC. HEVC has been designed to focus on increasing video resolution and increasing the use of parallel processing architectures. The proposed approach merges all traditional configuration files used in the encoding process into a single configuration file without removing any of the parameters used in the traditional methods. Compared with the traditional methods, the proposed approach improves the encoding time by halving the access time, which results from reducing the data exchange between the configuration files used in this process, without changing the rate-distortion (RD) performance or the compression ratio.

Keywords: High efficiency video coding, HEVC implementation, rate-distortion.

1 Introduction

HEVC is a new video coding standard, launched in 2013, that saves channel bandwidth and disk space compared with the H.264/AVC standard. It is also known as H.265 or MPEG-H Part 2 [1]. HEVC has been designed to focus on two key issues: increasing video resolution and increasing the use of parallel processing architectures. HEVC provides 50% more bit-rate reduction and a higher degree of parallelism than H.264/AVC by adopting a variety of coding efficiency enhancement and parallel processing tools [2]. Typically, H.264/AVC [3] divides a frame into macroblocks of a fixed 16×16 size. However, this fixed size limits the ability of H.264/AVC to encode and decode high resolution videos. In contrast, in HEVC a frame is divided into coding tree units (CTUs) of 16×16, 32×32, or 64×64. Each CTU can be further divided into smaller blocks, called coding units (CUs), using a quadtree structure. Each CU can be further split into either prediction units (PUs) or transform units (TUs) using the quadtree structure.

© Springer International Publishing Switzerland 2015. R. Silhavy et al. (eds.), Intelligent Systems in Cybernetics and Automation Theory, Advances in Intelligent Systems and Computing 348, DOI: 10.1007/978-3-319-18503-3_16

The size of each TU, used in the prediction error coding, ranges from 4×4 up to 32×32, leading to larger transforms than those of H.264/AVC, which only uses 4×4 and 8×8 transforms [4]. In turn, high resolution videos can be encoded more efficiently using HEVC than using the H.264/AVC standard [2].
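The CTU-to-CU quadtree partitioning described in this section can be illustrated with a minimal Python sketch. The split decision is passed in as a hypothetical callback (`should_split`); a real encoder decides whether to split from the rate-distortion cost, which is not modeled here. The 64×64 CTU and the 8×8 minimum CU size are assumptions chosen for illustration.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int]  # (x, y, size) of a square block in luma samples


def split_ctu(x: int, y: int, size: int, min_size: int,
              should_split: Callable[[int, int, int], bool]) -> List[Block]:
    """Recursively split a square block into four quadrants (quadtree)."""
    if size <= min_size or not should_split(x, y, size):
        return [(x, y, size)]  # leaf: this block becomes one coding unit
    half = size // 2
    leaves: List[Block] = []
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(split_ctu(x + dx, y + dy, half, min_size, should_split))
    return leaves


# Hypothetical rule: keep splitting only the block anchored at the top-left corner.
def corner_rule(x: int, y: int, size: int) -> bool:
    return x == 0 and y == 0


cus = split_ctu(0, 0, 64, 8, corner_rule)
```

Whatever split rule is used, the leaf blocks tile the CTU exactly, so their areas always sum to the CTU area.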

The HM (HEVC Test Model) software is the reference software for the HEVC project [5]. The HM software is written in C++ and is provided as source code. Since the HEVC project is still under development, the HM software is also under development and changes frequently. The HM software folders contain all files that are required for building and running the software (the folders bin and lib are created while building the software) [6]. The configuration file controls only the main settings of the encoding process; any setting that does not exist in the configuration file must be changed directly in the C++ source code, which requires understanding all source code modules [7]. It is possible to build the HM software on a Windows 32 platform with Microsoft Visual Studio .NET and on a Linux platform with gcc version 4. Since the HM software is written in C++, it should also be possible to build the software on other platforms that provide a C++ compiler. All libraries are static libraries and all executables are statically linked to the libraries. The folder build contains a Microsoft Visual Studio .NET workspace and videoencdec.sln. To build the software, this workspace is opened with Microsoft Visual Studio .NET and all project files are built [7].

This paper is organized as follows. The usage and configuration of the HM software and the proposed encoding method are presented in Section 2 and Section 3, respectively. Experimental results are shown in Section 4. Finally, conclusions are given in Section 5.

2 Usage and Configuration of The HM Software

In this section, the usage of the HM software and the setup of the configuration files in the HM software package are discussed. The libraries provided by the HM software [7] are described as follows: (i) TAppCommon provides classes that are used by both the encoder and the decoder, for example macroblock data structures, buffers for storing and accessing video data, and algorithms for deblocking. (ii) TLibEncoder provides classes that are only used by the encoder. For example, it includes classes for motion estimation, mode decision, and entropy encoding. (iii) TLibDecoder provides classes that are only used by the decoder. For example, it includes classes for entropy decoding and bitstream parsing. (iv) TLibVideoIO provides classes for reading and writing NAL units in the byte-stream format as well as classes for reading and writing raw video data. Our work focuses on TAppCommon, TLibEncoder, and TLibDecoder; the encoder and the decoder are discussed in Section 2.1 and Section 2.2, respectively.

2.1 Encoder of The HM Software

The encoder can be used for generating HEVC bitstreams and reconstructed files [6]. The basic encoder file is illustrated in Fig. 1-(a). It gives the filename of the executable file (i.e., TAppEncoder.exe), the main configuration file (i.e., encoder random-access-maim.cfg), the video configuration file (i.e., Tennis.cfg), and the output file (i.e., log-RA.txt), respectively. The -c parameter in the basic encoder file defines the configuration file to be used. Multiple configuration files may be used with repeated -c options.

(a) The basic encoder file [7]

(b) The basic decoder file [7]

Fig. 1. The basic encoder and decoder files
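Since the content of Fig. 1-(a) is not reproduced in the text, the following Python sketch only illustrates how such an encoder invocation might be assembled from the file names given above. It assumes one -c option per configuration file and that the log file is produced by redirecting the encoder's standard output; both are assumptions, not a copy of the figure.

```python
from typing import List, Sequence


def build_encoder_command(executable: str, config_files: Sequence[str]) -> List[str]:
    """Build an argument list with one -c option per configuration file."""
    if not config_files:
        raise ValueError("at least one configuration file is required")
    command = [executable]
    for cfg in config_files:
        command += ["-c", cfg]
    return command


command = build_encoder_command(
    "TAppEncoder.exe",
    ["encoder random-access-maim.cfg", "Tennis.cfg"],
)
# To run it and capture the encoder log (assumes the executable is available):
#   import subprocess
#   with open("log-RA.txt", "w") as log:
#       subprocess.run(command, stdout=log, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.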

Fig. 2-(a) shows the main configuration file (i.e., encoder random-access-maim.cfg), which describes the method used in the coding process. This method can be one of three: the all-intra method, the random-access method, or the low-delay method [6], as shown in Fig. 3-(a), Fig. 3-(b), and Fig. 3-(c), respectively.

Fig. 3-(a) shows a graphical presentation of the all-intra configuration. Each picture in a video sequence is encoded as an instantaneous decoder refresh (IDR) picture. An encoder sends an IDR coded picture to clear the contents of the reference picture buffer. On receiving an IDR coded picture, the decoder marks all pictures in the reference buffer as unused for reference. All subsequently transmitted slices can be decoded without reference to any frame decoded prior to the IDR picture. The first picture in a coded video sequence is always an IDR picture. The quantization parameter (QP) is not allowed to change during a sequence or within a picture [6].

Fig. 3-(b) shows a graphical presentation of the random-access configuration. A hierarchical bidirectional (B) structure is used for encoding. An intra picture is inserted cyclically, about once per second. The first intra picture of a video sequence is encoded as an IDR picture and the other intra pictures are encoded as non-IDR intra pictures [6]. The pictures located between successive intra pictures in display order are encoded as B pictures. The second and third temporal layers consist of referenced B pictures, and the highest temporal layer contains non-referenced B pictures only. The QP of each inter coded picture is derived by adding an offset, which depends on the temporal layer, to the QP of the intra coded picture.
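The QP derivation for the random-access configuration can be sketched as a lookup of a per-layer offset. The offset values below are hypothetical placeholders; the text does not state them, and in practice they come from the encoder configuration.

```python
from typing import Dict

# Hypothetical per-layer offsets; the actual values come from the encoder configuration.
QP_OFFSET_BY_TEMPORAL_LAYER: Dict[int, int] = {1: 1, 2: 2, 3: 3, 4: 4}


def inter_picture_qp(intra_qp: int, temporal_layer: int,
                     offsets: Dict[int, int] = QP_OFFSET_BY_TEMPORAL_LAYER) -> int:
    """QP of an inter coded picture = intra QP + offset of its temporal layer."""
    if temporal_layer not in offsets:
        raise ValueError(f"no QP offset defined for temporal layer {temporal_layer}")
    return intra_qp + offsets[temporal_layer]


# QPs of the inter coded pictures for an intra QP of 32, one value per temporal layer.
qps = [inter_picture_qp(32, layer) for layer in (1, 2, 3, 4)]
```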

Fig. 3-(c) shows a graphical presentation of the low-delay configuration. Only the first picture in a video sequence is encoded as an IDR picture. There are two kinds of low-delay coding configurations: bidirectional (B) low-delay and predictive (P) low-delay. In the mandatory low-delay (B low-delay) test condition, the other successive pictures are encoded as generalized P and B pictures (GPB) [6].

Some of the parameters used in the main configuration file [7] for the HEVC are discussed below: (1) OutputFile specifies the filename of the bitstream to be generated. (2) ReconFile specifies the filename of the coded and reconstructed input sequence. This sequence is provided for debugging purposes. It is created automatically by the encoder. (3) BasisQP specifies the basic quantization parameter. This parameter shall be used to control the bit-rate of a bitstream. (4) GOPSize specifies the group of pictures (GOP) size that shall be used for encoding a video sequence. A GOP consists of an anchor picture and several hierarchically coded B pictures that are located between the anchor pictures. (5) LoopFilterDisable specifies how the in-loop deblocking filter is applied. The following values are supported [8]: i) The deblocking filter is applied to all block edges. ii) The deblocking filter is not applied. iii) The deblocking filter is applied to all block edges with the exception of slice boundaries. (6) LoopFilterTcOffset specifies the Tc offset for the deblocking filter. (7) LoopFilterBetaOffset specifies the beta offset for the deblocking filter. LoopFilterBetaOffset and LoopFilterTcOffset shall be integer values in the range of -6 to 6 [9]. These parameters can be used to adjust the strength of the deblocking filter [10].
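The range constraint on the two deblocking offsets can be checked before an encoding run is started. The following Python sketch is a hypothetical helper, not part of the HM software; only the parameter names and the range of -6 to 6 are taken from the text.

```python
def check_deblocking_offsets(beta_offset: int, tc_offset: int) -> None:
    """Raise ValueError if either offset is not an integer in the range -6 to 6."""
    for name, value in (("LoopFilterBetaOffset", beta_offset),
                        ("LoopFilterTcOffset", tc_offset)):
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer, got {value!r}")
        if not -6 <= value <= 6:
            raise ValueError(f"{name}={value} is outside the range -6 to 6")


check_deblocking_offsets(beta_offset=0, tc_offset=-2)  # valid: returns without error
```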

Fig. 2-(b) shows the video configuration file (i.e., Tennis.cfg), which gives all characteristics of the video used in the encoding process. Some of these characteristics are described below [7]: (1) InputFile specifies the filename of the original raw video sequence to be encoded. The input files should be in YUV format. (2) SourceWidth specifies the width of the input videos in luma samples. SourceWidth shall be non-zero. This parameter shall be present in each configuration file, since the default value of 0 is invalid. (3) SourceHeight specifies the height of the input videos in luma samples. SourceHeight shall be non-zero. This parameter shall be present in each configuration file, since the default value of 0 is invalid. (4) FrameRate specifies the frame rate of the input sequence in Hz. (5) FramesToBeEncoded specifies the number of frames of the input sequence to be encoded for one view. Fig. 2-(c) shows the output file (i.e., log-RA.txt), which gives the results after encoding (i.e., bit-rate, YUV values, and encoding time).
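As an illustration of the rule that SourceWidth and SourceHeight must be present and non-zero, the following Python sketch parses a small video configuration and checks both parameters. The `Name : value` syntax with `#` comments is an assumption about the file format, the helper names are hypothetical, and the file name Tennis.yuv is invented; the numeric values follow the Tennis entry of Table 1.

```python
from typing import Dict


def parse_cfg(text: str) -> Dict[str, str]:
    """Parse 'Name : value' lines, ignoring blank lines and '#' comments (assumed syntax)."""
    params: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"cannot parse line: {raw_line!r}")
        params[name.strip()] = value.strip()
    return params


def check_video_params(params: Dict[str, str]) -> None:
    """SourceWidth and SourceHeight must be present and non-zero."""
    for name in ("SourceWidth", "SourceHeight"):
        if int(params.get(name, "0")) == 0:
            raise ValueError(f"{name} must be present and non-zero")


# Illustrative file content; Tennis.yuv is a hypothetical file name.
example = """
InputFile         : Tennis.yuv   # raw input video
SourceWidth       : 1920
SourceHeight      : 1080
FrameRate         : 50
FramesToBeEncoded : 240
"""
video = parse_cfg(example)
check_video_params(video)
```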

2.2 Decoder of The HM Software

The basic decoder file is illustrated in Fig. 1-(b). It gives the filename of the executable file (i.e., TAppDecoder.exe), the bitstream file to be decoded (i.e., strRA.str), and the file for the reconstructed video sequence (i.e., decRA.yuv), respectively. The -b and -o parameters in the basic decoder file specify the coded bitstream file to be decoded and the output reconstructed video file, respectively [7]. The -d parameter specifies the luma internal bit depth of the reconstructed YUV file (i.e., the internal bit depth is equal to 8 or 10).
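The decoder invocation described here can be sketched in the same way. Fig. 1-(b) is not reproduced in the text, so the argument order below is an assumption; only the executable name, the file names, and the -b, -o, and -d parameters come from the description.

```python
from typing import List


def build_decoder_command(executable: str, bitstream: str,
                          recon_file: str, bit_depth: int) -> List[str]:
    """Build the decoder argument list: -b bitstream, -o reconstructed YUV, -d bit depth."""
    if bit_depth not in (8, 10):
        raise ValueError("the text mentions internal bit depths of 8 or 10 only")
    return [executable, "-b", bitstream, "-o", recon_file, "-d", str(bit_depth)]


command = build_decoder_command("TAppDecoder.exe", "strRA.str", "decRA.yuv", 8)
```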

(a) The main configuration file (i.e., encoder random-access-maim.cfg) [7]

(b) The video configuration file (i.e., Tennis.cfg) [7]

(c) The output file (i.e., log-RA.txt) [7]

Fig. 2. (a-c) The configuration files used in the coding process [7]

(a) Graphical presentation of all-Intra configuration [6]

(b) Graphical presentation of Random-access configuration [6]

(c) Graphical presentation of Low-delay configuration [6]

Fig. 3. (a-c) Graphical presentation for the coding methods

3 The Proposed Approach

The proposed approach starts from the two configuration files (i.e., the random-access configuration file and the Tennis configuration file), which together present a collection of configuration parameters in the HEVC standard and are called by the basic encoder file to start the encoding process. The first describes the video coding method and the latter describes the characteristics of the video used in the coding process. The access time needed for the basic encoder file to call and process the two configuration files can be reduced by half by merging them into a single file. Each configuration file has its own parameters, each with a default value, so when a configuration parameter is not present in the configuration file, the default value is used instead. To merge the two files into one, we must therefore first change the calls to these parameters in the C++ source code while keeping all functions related to these parameters working correctly. We thus modify the handling of all configuration parameters in the C++ source code so that they can be read from the new configuration file. The new configuration file is shown in Fig. 4.

(a) The new encoding file of the proposed approach

(b) The new configuration file of the proposed approach

Fig. 4. The new configuration files of the proposed approach
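At the file level, combining the two parameter sets into one configuration can be sketched as follows. This is only an illustration under stated assumptions: it is not the authors' C++ modification of the HM source, the `Name : value` syntax is assumed, the helper names are hypothetical, and the parameter values are illustrative.

```python
from typing import Dict


def parse_cfg(text: str) -> Dict[str, str]:
    """Parse 'Name : value' lines, skipping blank lines and '#' comments (assumed syntax)."""
    params: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if line:
            name, _, value = line.partition(":")
            params[name.strip()] = value.strip()
    return params


def merge_cfgs(main_cfg: Dict[str, str], video_cfg: Dict[str, str]) -> Dict[str, str]:
    """Combine both parameter sets, refusing to silently override a duplicate."""
    duplicates = sorted(set(main_cfg) & set(video_cfg))
    if duplicates:
        raise ValueError(f"parameters defined in both files: {duplicates}")
    return {**main_cfg, **video_cfg}


def render_cfg(params: Dict[str, str]) -> str:
    """Write the merged parameters back out, one 'Name : value' per line."""
    width = max((len(name) for name in params), default=0)
    return "\n".join(f"{name:<{width}} : {value}" for name, value in params.items())


# Illustrative values only; they are not copied from the paper's configuration files.
main_cfg = parse_cfg("GOPSize : 8\nLoopFilterDisable : 0\n")
video_cfg = parse_cfg("SourceWidth : 1920\nSourceHeight : 1080\nFrameRate : 50\n")
merged_text = render_cfg(merge_cfgs(main_cfg, video_cfg))
```

Rejecting duplicate names keeps every parameter from both files, in line with the requirement that no parameter of the traditional method is removed.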

Table 1. Description of data sequences used

Seq. #  Seq. Name        File size (MB)  Frame rate (fps)  # of Frames  Class  Resolution  Color format
1       PeopleOnStreet   555             30                150          A      2560×1600   4:2:0
2       Traffic          563             30                150          A      2560×1600   4:2:0
3       Train            1714            60                300          A      2560×1600   4:2:0
4       BasketBallDrive  989             50                300          B      1920×1080   4:2:0
5       Tennis           389             50                240          B      1920×1080   4:2:0
6       ParkScene        475             24                240          B      1920×1080   4:2:0
7       BasketBallDrill  20              50                200          C      832×480     4:2:0
8       Keiba            97              30                500          C      832×480     4:2:0
9       BQMall           25              60                200          C      832×480     4:2:0
10      RaceHorses       32              60                300          D      416×240     4:2:0
11      BQSquare         64              60                600          D      416×240     4:2:0
12      BasketBallPass   46              50                500          D      416×240     4:2:0
13      Vidyo1           449             60                600          E      1280×720    4:2:0
14      SlideShow        88              20                500          E      1280×720    4:2:0
15      Vidyo3           416             60                600          E      1280×720    4:2:0

Table 2. Encoding time comparison (hours)

Seq. #  Class  Seq. Name        Traditional approach (hours)  Proposed approach (hours)
1       A      PeopleOnStreet   5.33                          5.21
2       A      Traffic          3.82                          3.75
3       A      Train            8.72                          8.64
4       B      BasketBallDrive  8.05                          7.97
5       B      Tennis           6.80                          6.76
6       B      ParkScene        3.23                          3.12
7       C      BasketBallDrill  1.50                          1.44
8       C      Keiba            1.53                          1.51
9       C      BQMall           1.72                          1.66
10      D      RaceHorses       0.27                          0.25
11      D      BQSquare         0.40                          0.37
12      D      BasketBallPass   0.41                          0.36

Fig. 4-(a) illustrates the new basic encoder file. It gives the filename of the executable file (i.e., TAppEncoder.exe), the combined main configuration and video file (i.e., encoder random-access-maim-Tennis.cfg), and the output file (i.e., log-RA.txt), respectively. Fig. 4-(b) shows the new configuration file used in the encoding process, which contains both the encoding method and the video characteristics.

4 Experimental Results

In this section, the data sequences, the experimental setup, and the results are discussed. The data sets used in the experiments include five classes of real sequences [11]. Each class has three video sequences with different features, whose characteristics are shown in Table 1. Our implementation runs on an Intel Core i5 with 4 GB of RAM. The proposed approach (referred to as Proposed) is compared to the traditional approach (referred to as Traditional) [6]. We use the HEVC standard software (HM10) [5] for encoding and decoding the data sets mentioned above. In this paper, the performance of the competing approaches is evaluated by i) the rate-distortion (RD) performance (in dB/Kbps), ii) the Bjontegaard (BD) rate ratio [12], iii) the compression ratio between the decoded video sequence and its original using the two approaches for all data sets, and iv) the encoding time. The quantization parameter (QP) is set to 22, 27, 32, and 37 [13]. The group of pictures (GOP) size is set to 8. The coding method is the random-access configuration (i.e., class E is skipped) [13].

It is worth noting that Proposed achieves the same results as Traditional in terms of the RD performance, the BD rate ratio, and the compression ratio at the different QPs (22, 27, 32, and 37), using all frames of all video sequences described above, while using only one configuration file for the encoding process instead of the two traditional configuration files. In terms of encoding time, the proposed approach surpasses the traditional approach, with a maximum decrease of 10% in class D at QP equal to 32, as shown in Table 2.
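The relative reduction in encoding time can be recomputed from the figures in Table 2. The Python sketch below aggregates the hours per class and reports the percentage decrease. Because the table values are rounded to two decimals, and the text does not say how the per-class figure was aggregated, the results should be treated as approximate.

```python
from typing import Dict, List, Tuple

# (traditional hours, proposed hours) per sequence, copied from Table 2.
TABLE_2: Dict[str, List[Tuple[float, float]]] = {
    "A": [(5.33, 5.21), (3.82, 3.75), (8.72, 8.64)],
    "B": [(8.05, 7.97), (6.80, 6.76), (3.23, 3.12)],
    "C": [(1.50, 1.44), (1.53, 1.51), (1.72, 1.66)],
    "D": [(0.27, 0.25), (0.40, 0.37), (0.41, 0.36)],
}


def reduction_percent(traditional: float, proposed: float) -> float:
    """Relative decrease in encoding time, in percent of the traditional time."""
    return 100.0 * (traditional - proposed) / traditional


def class_reduction(rows: List[Tuple[float, float]]) -> float:
    """Reduction computed on the total hours of one class (one possible aggregation)."""
    total_traditional = sum(traditional for traditional, _ in rows)
    total_proposed = sum(proposed for _, proposed in rows)
    return reduction_percent(total_traditional, total_proposed)


per_class = {name: class_reduction(rows) for name, rows in TABLE_2.items()}
best_class = max(per_class, key=per_class.get)  # class with the largest reduction
```

With this aggregation the largest reduction falls in class D, at a little over 9%.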

5 Conclusions

In this paper, a new approach is introduced to implement the encoding of high resolution videos using the HEVC standard by merging the two traditional configuration files into one configuration file. This modification is based on collecting all the parameters needed in the encoding process, together with the video characteristics, into one file. Compared with the traditional approach, the proposed approach improves the encoding time by halving the access time, which results from reducing the data exchange between the configuration files. There is no change in the rate-distortion performance or the compression ratio.

References

1. Bossen, F., Bross, B., Suhring, K., Flynn, D.: HEVC complexity and implementation analysis. IEEE Trans. on Cir. and Sys. for Video Tech. 22(12) (2012)

2. Sullivan, G., Ohm, J.-R., Han, W.-J., Wiegand, T.: Overview of the high efficiency video coding (HEVC) standard. IEEE Trans. on Cir. and Sys. for Video Tech. 22(12) (December 2012)

3. Wiegand, T., Sullivan, G.J., Bjontegaard, G., Luthra, A.: Overview of the H.264/AVC video coding standard. IEEE Trans. on Cir. and Sys. for Video Tech. 13(7) (July 2003)

4. Hsia, S.-C., Hsu, W.-C., Lee, S.-C.: Low-complexity high-quality adaptive deblocking filter for H.264/AVC system. Signal Processing: Image Communication 27, 749–759 (2012)

5. Online (2012), http://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches

6. Kim, I.-K., McCann, K., Sugimoto, K., Bross, B., Han, W.-J.: High efficiency video coding (HEVC) test model draft 10 (HM 10) encoder description. Technical Report Doc. JCTVC-L1002, JCT-VC, Geneva, Switzerland (January 2013)

7. Bossen, F., Flynn, D., Suhring, K.: HM Software Manual. ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, Geneva, Switzerland (January 2013)

8. List, P., Joch, A., Lainema, J., Bjontegaard, G., Karczewicz, M.: Adaptive deblocking filter. IEEE Trans. on Cir. and Sys. for Video Tech. 13(7), 614–619 (2003)

9. Norkin, A., Bjontegaard, G., Fuldseth, A., Narroschke, M.: HEVC deblocking filter. IEEE Trans. on Cir. and Sys. for Video Tech. 22(12) (December 2012)

10. Lou, J., Jagmohan, A., He, D., Lu, L., Sun, M.-T.: H.264 deblocking speedup. IEEE Trans. on Cir. and Sys. for Video Tech. 19(8) (2009)

11. Online (2003), ftp://hvc:[email protected]/testsequences

12. Bjontegaard, G.: Calculation of average PSNR differences between RD-curves. VCEG-M33, Texas, USA (April 2001)

13. Bossen, F.: Common test conditions and software reference configurations. JCTVC-D600, Daegu, KR (January 2011)