Multimedia Tools and Applications (2022) 81:33375–33395
https://doi.org/10.1007/s11042-022-13145-y
Robust HDR video watermarking method based
on the HVS model and T-QR
Meng Du 1,2 · Ting Luo 1,2 · Haiyong Xu 2 · Yang Song 1 · Chunpeng Wang 3 · Li Li 4
Received: 16 February 2021 / Revised: 16 June 2021 / Accepted: 10 April 2022 /
Published online: 18 April 2022
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022
Abstract
To protect the copyright of high dynamic range (HDR) video, a robust HDR video watermarking method based on the human visual system (HVS) model and Tensor-QR decomposition (T-QR) is proposed. To obtain the main information of the HDR video, key frames are extracted by detecting scene changes, and each key frame is treated as a third-order tensor to preserve its main characteristics. Each key frame is divided into non-overlapping blocks, and each block is decomposed by T-QR to obtain the orthogonal tensor, which consists of three matrices and represents the main energies of the frame. Since the second matrix captures more of the frame's correlations than the other two matrices, it is used to embed the watermark for robustness. Moreover, to trade off watermarking robustness against visual quality, the HVS model of each key frame is computed from luminance perception, image contrast and contrast masking to determine the watermark embedding strength. Experimental results show that the proposed method can resist a variety of tone mapping and video attacks, and is more robust than existing watermarking methods.
Keywords HDR video · HVS model · T-QR · Robust watermarking
1 Introduction
Different from the traditional low dynamic range (LDR) video, the high dynamic range (HDR) video can describe the luminance and color of the real world. However, the HDR video cannot
* Ting Luo
luoting@nbu.edu.cn
1 College of Science and Technology, Ningbo University, Ningbo 315212, China
2 Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China
3 School of Information, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
4 School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
be directly displayed on currently available display and print devices because of its wide luminance range [10]. The HDR video is therefore usually converted to an LDR video through tone mapping operators (TMOs) and then displayed on currently available display and print devices without losing too much detail [9, 21, 22]. Therefore, how to protect the copyright of the HDR video after TMO has become an urgent problem.
Digital steganography is an information security mechanism used to hide secret data during a communication session by embedding a secret message into a multimedia carrier, where only the sender and the recipient know of the secret data's existence [1]. In order to prevent the secret information from being discovered by an adversary, digital steganography pursues high invisibility and high embedding capacity. However, digital steganography lacks robustness and cannot protect the copyright of the HDR video. Watermarking technology provides an efficient solution to copyright protection [24, 32, 39].
Watermarking can be classified into spatial-domain and transform-domain methods. Spatial-domain watermarking directly modifies pixels to embed the watermark, such as least significant bit (LSB) replacement [41]. Since this type of watermarking is sensitive to any modification, it is widely used in image content authentication. However, pixels of the HDR video are floating-point values, and it is difficult to modify them directly because different HDR videos have different luminance ranges. Recently, some HDR watermarking methods convert the HDR image to another format, such as the RGBE format [8, 37], the LogLuv format [25] and the OpenEXR format [26], which may lose details of the videos. Cheng et al. converted the floating-point format of the HDR image to the RGBE format, and the watermark was embedded by using LSB, without the capability of resisting image attacks.
Compared with the spatial-domain watermarking method, the transform-domain watermarking method is more robust and can resist a variety of malicious and non-malicious attacks [14, 18, 27]. Many transforms have been used for watermarking, such as the Discrete Cosine Transform (DCT) [20], Discrete Wavelet Transform (DWT) [6], Singular Value Decomposition (SVD) [11] and QR decomposition [7]. Nouioua et al. used shot boundary detection to select fast-motion frames, and then utilized SVD and Multiresolution-SVD (MR-SVD) to embed the watermark [33]. Sang et al. randomly selected video frames from the original video, and the watermark was embedded into the approximation sub-band of each frame transformed by DCT and DWT [34]. However, the randomly selected frames cannot represent the main information of the video, and they may be destroyed after encoding; thus, this kind of video watermarking cannot resist the encoding attack. In order to obtain the main information of the video, Ponni et al. utilized the Fibonacci sequence to select key frames, and then embedded the watermark into the key frames by using SVD and DWT [2]. Since the dynamic luminance range of the HDR video differs from that of the LDR video, the key frame extraction techniques of the LDR video cannot be directly applied to the HDR video; otherwise, the key frames of the HDR video may be obtained ineffectively.
Besides resisting video attacks, an HDR video watermarking method should resist special HDR image processing, such as TMO. Bakhsh et al. utilized an artificial bee colony to select the best blocks, and the watermark was embedded into the approximation sub-band of DWT in each selected block [5]. Wu et al. transformed the HDR images into LDR images by using a special TMO to embed the watermark into its DCT domain, but that method was only robust to the pre-determined TMO [36]. Maiorana et al. applied the logarithm, DWT and Radon-DCT to the luminance component of the HDR images, and then employed quantization index modulation (QIM) to embed the watermark; however, the average bit error rate (BER) reached 20%, so the watermarking robustness was not high [28]. Yu et al. used the modified specular-free image to compute the luminance mask of the HDR image, which was used to guide watermark embedding into the low-luminance areas of the HDR image, but the method did not consider the human visual perception characteristics of the HDR image [38].
The above watermarking methods are designed for HDR images, and HDR video watermarking is still in its infancy. Unlike an HDR image watermarking method, an HDR video watermarking method must resist not only image attacks and TMOs but also video attacks. In order to design a robust watermarking method for the HDR video, a multidimensional transformation is required to maintain the strong correlations among the three channels. Since Tensor-QR Decomposition (T-QR) [13] can combine the three channels into a third-order tensor, their main characteristics can be preserved for robustness.
Watermarking robustness can also be enhanced by increasing the embedding strength, but this leads to visual distortion of the watermarked image. Thus, watermarking imperceptibility and robustness are contradictory, and in order to obtain a trade-off between them, watermarking methods based on human visual perception have been studied [4, 12, 15, 23, 35, 40]. Hu et al. used the contrast sensitivity function (CSF) to obtain the perceptual weight of the embedding strength, guiding robust watermark embedding with low image distortion [15]. Lai et al. utilized visual entropy and edge entropy to select the optimal embedding regions, which decreases visual distortion [23]. Zhang et al. combined visual saliency and the contourlet transform to determine the quantization step size, which was modified to embed the watermark, enhancing both imperceptibility and robustness [40]. However, the above visual perception models were designed for LDR content, and they cannot capture the human visual system (HVS) characteristics of the wide luminance range in HDR content. Thus, a special visual perception model should be designed for embedding watermarks in HDR content.
In order to balance watermarking imperceptibility and robustness, Guerrini et al. utilized a luminance perception mask, an activity perception mask and edge perception to compute the perceptual mask of the HDR image in the DWT domain, but the method was not robust to many attacks, since the average BER reached 29% [12]. In order to optimize robustness and imperceptibility, Bai et al. designed a hierarchical embedding intensity and a hybrid perceptual mask to embed the watermark [4]. Vassilios et al. utilized the wavelet transform of the just noticeable difference (JND)-scaled space of the HDR image as the embedding domain, and the CSF was employed to modulate the watermark embedding strength; however, the embedding capacity was low [35]. Moreover, the above HDR watermarking methods only consider the visual characteristics of the HDR image and ignore the temporal correlation of the HDR video.
In this paper, a robust HDR video watermarking method using the HVS model and T-QR is proposed. Key frames are extracted by using scene change detection. Each key frame is divided into non-overlapping blocks, and T-QR is applied to each block to obtain the orthogonal tensor, which consists of the first, second and third matrices. Compared with the other two matrices, the second matrix is more robust, and thus the second matrix is chosen to embed the watermark. Moreover, in order to obtain the trade-off between watermarking imperceptibility and robustness, the HVS model of the HDR video is computed by using the luminance perception, image contrast and contrast masking, which determines the embedding strength. The main contributions of the paper are listed as follows.
1. In order to obtain the main information of the HDR video for embedding watermark, the
key frames are extracted by using the scene change detection.
2. Different from traditional transformations that operate on a single channel, T-QR treats each key frame of the HDR video as a whole, so that the strong correlations within each key frame are preserved for robust watermark embedding.
3. The HVS model of the HDR video is computed by using the luminance perception, image
contrast and contrast masking to balance the imperceptibility and robustness of the HDR
video watermarking.
This paper is organized as follows. Section 2 introduces the background knowledge of the related technology. In Section 3, we describe the processes of watermark embedding and extraction. Section 4 discusses and analyses the experimental results. Finally, Section 5 concludes the paper.
2 Background
In this section, the related background technologies are described, together with the notation used in the rest of the paper. Variables are shown in italics, such as a; matrices are shown in bold letters, such as A; and higher-order tensors are shown in calligraphic letters, such as A.
2.1 Tensor
With the development of the Internet, multimedia data is becoming increasingly multi-dimensional. A tensor is a form of multi-dimensional data that can store a large amount of information. A p-order tensor can be written as:

A = (a_{i_1 i_2 … i_p}) ∈ ℝ^{l_1 × l_2 × … × l_p},  (1)

where l_1, l_2, …, l_p ∈ ℤ indicate the number of elements in each dimension. Therefore, a vector can be considered a first-order tensor, and a matrix can be considered a second-order tensor.
Higher-order tensors can be represented by a set of matrices. For example, a third-order tensor can be divided into horizontal, lateral and frontal slices [31], represented as A_{k::}, A_{:k:} and A_{::k}, respectively, where k indexes the slices along the corresponding mode.
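In NumPy terms, the three slice families of a third-order tensor are simply fixed-index views (a small illustration):

```python
import numpy as np

A = np.arange(24).reshape(2, 3, 4)  # a 2 x 3 x 4 third-order tensor
horizontal = A[0, :, :]             # horizontal slice A_{k::}
lateral = A[:, 0, :]                # lateral slice A_{:k:}
frontal = A[:, :, 0]                # frontal slice A_{::k}
```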
2.2 QR to T-QR
QR is an important decomposition in linear algebra and is applicable to any matrix: it factors a matrix into an orthogonal matrix and an upper triangular matrix.
Suppose A ∈ ℝ^{m × n} is an image matrix. After QR decomposition, A can be written as:

A = QR,  (2)

where Q ∈ ℝ^{m × n} is an orthogonal matrix and R ∈ ℝ^{n × n} is an upper triangular matrix.
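Eq. (2) can be checked directly with NumPy's QR routine:

```python
import numpy as np

# QR of a small square block: Q is orthogonal, R is upper triangular.
A = np.array([[4.0, 1.0],
              [3.0, 2.0]])
Q, R = np.linalg.qr(A)
```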
For a higher-order QR decomposition, T-QR can be used. Let B ∈ ℝ^{l_1 × l_2 × l_3} be a third-order tensor; then B can be decomposed as:

B P = Q R,  (3)

where Q ∈ ℝ^{l_1 × l_1 × l_3} is an orthogonal tensor, R ∈ ℝ^{l_1 × l_2 × l_3} is an upper triangular tensor, and P ∈ ℝ^{l_2 × l_2 × l_3} is a permutation tensor whose values are 0 or 1 and which satisfies P^T P = P P^T = I, where I is the identity tensor. Each frame of the HDR video can be considered a third-order tensor of size n × m × 3, where l_1 = n, l_2 = m and l_3 = 3. Therefore, after operating T-QR, the orthogonal tensor and the upper triangular tensor are obtained according to Eq. (3). The frontal slices Q_{::k} of the orthogonal tensor form three matrices, named the first, second and third matrices for k = 1, 2 and 3, respectively; let Q_1, Q_2 and Q_3 denote them. Regarding watermarking robustness, Q_2 is chosen to embed the watermark, and the supporting discussion is given in Section 4.1.
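The paper adopts the T-QR of [13]; as an illustration only, a tensor QR in the style of the Kilmer-Martin t-product can be sketched as a slice-wise matrix QR in the Fourier domain along the third mode. This construction omits the permutation tensor P and may differ from the authors' exact variant:

```python
import numpy as np

def t_product(X, Y):
    """t-product of two third-order tensors: slice-wise matrix product
    in the Fourier domain along the third mode."""
    Xf = np.fft.fft(X, axis=2)
    Yf = np.fft.fft(Y, axis=2)
    Zf = np.einsum('ijk,jlk->ilk', Xf, Yf)
    return np.real(np.fft.ifft(Zf, axis=2))

def t_qr(B):
    """Tensor QR sketch: FFT along the third mode, matrix QR of each
    frontal slice in the Fourier domain, then inverse FFT. Satisfies
    t_product(Q, R) == B; R is upper triangular slice-wise in the
    Fourier domain."""
    l1, l2, l3 = B.shape
    Bf = np.fft.fft(B, axis=2)
    Qf = np.zeros((l1, l1, l3), dtype=complex)
    Rf = np.zeros((l1, l2, l3), dtype=complex)
    for k in range(l3):
        Qf[:, :, k], Rf[:, :, k] = np.linalg.qr(Bf[:, :, k], mode='complete')
    return np.real(np.fft.ifft(Qf, axis=2)), np.real(np.fft.ifft(Rf, axis=2))
```

For a frame treated as an n × m × 3 tensor, this yields Q of size n × n × 3 and R of size n × m × 3, matching the dimensions stated above.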
3 Proposed HDR video watermarking method
In order to protect the copyright of the HDR video, a robust HDR video watermarking method based on the HVS model and T-QR is proposed. In this section, key frames of the HDR video are first extracted by using scene change detection. Next, the HVS model is computed based on the luminance perception, image contrast and contrast masking. Then, the process of watermark embedding based on the HVS model and T-QR is illustrated. Finally, the process of watermark extraction is introduced.
3.1 Key frame extraction
In order to obtain the main information of the HDR video, key frames are extracted by using scene change detection. Scene change detection identifies motion scenes in the HDR video by using the histogram difference, and the histogram difference between the HDR video frames is compared with a predefined threshold to determine the key frames:

H = Σ_{e=1}^{l} [h(f_e) − h(f_{e+1})]^2 / max(h(f_e), h(f_{e+1})),  (4)

where l is the number of HDR video frames, h(·) represents the histogram of an HDR video frame, f_e represents the e-th HDR video frame, and max(·) returns the maximum value. If H > θ, where θ is a threshold, the corresponding frame is a key frame.
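A minimal sketch of the histogram-difference test of Eq. (4), applied to each pair of consecutive frames; the bin count, the fixed [0, 1) luminance range and the per-pair decision are assumptions of this sketch:

```python
import numpy as np

def key_frames(frames, theta, bins=64):
    """Return indices of frames that start a new scene, judged by the
    normalized squared histogram difference of consecutive frames."""
    keys = []
    for e in range(len(frames) - 1):
        h1, _ = np.histogram(frames[e], bins=bins, range=(0.0, 1.0))
        h2, _ = np.histogram(frames[e + 1], bins=bins, range=(0.0, 1.0))
        # squared difference normalized by the larger bin count
        diff = (h1 - h2) ** 2 / np.maximum(np.maximum(h1, h2), 1)
        if diff.sum() > theta:
            keys.append(e + 1)  # frame e+1 starts a new scene
    return keys
```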
3.2 HVS model
The proposed HVS model is calculated by combining the luminance perception, image contrast and contrast masking, and is used to determine the embedding strength. The computation of the luminance perception, image contrast and contrast masking is described as follows.
3.2.1 Luminance perception
In order to obtain the luminance perception, a psychophysical study, namely the eye sensitivity model (ESM) [19], is adopted. In the ESM experiment, the observer adapts to a background illumination for a sufficient amount of time, and the illumination is then increased to a level at which the change is just noticeable to the observer. This experiment shows that the luminance range of HDR content covers the entire range that the HVS can see, and that the response of the human eye over this range is neither always linear nor logarithmic. According to the ESM, a non-linear model can be obtained that describes luminance perception under different luminance ranges and converts the input background luminance into luminance in JND units:

log10(L_a) = −3.18,                               if log10(L) < −3.94
             (0.405 log10(L) + 1.6)^2.18 − 3.18,  if −3.94 ≤ log10(L) < −1.44
             log10(L) − 1.345,                    if −1.44 ≤ log10(L) < −0.0184    (5)
             (0.249 log10(L) + 0.65)^2.7 − 1.67,  if −0.0184 ≤ log10(L) < 1.9
             log10(L) − 2.205,                    if log10(L) ≥ 1.9

where L is the background luminance.
3.2.2 Image contrast
Image contrast can be defined in different ways, but it is usually related to variations in image luminance; global contrast measures image sharpness and is a fine-scale image feature. A framework for perceptual contrast processing of HDR images, namely perceptual contrast [29], is adopted: the luminance values of the image are converted into physical contrast values and then into response values of the HVS. The perceptual contrast is computed as follows.
Step.1 The HDR image is transformed from luminance domain to physical contrast
domain. The logarithmic ratio G is used as a measure of contrast between two pixels.
Low-pass contrast is defined as a difference between a pixel and its neighbors at a
particular level t of the Gaussian pyramid.
G^t_{i,j} = log10(L^t_i / L^t_j),  (6)

where L^t_i and L^t_j are the luminance values of neighboring pixels i and j, respectively.
Step.2 Physical contrast domain is converted into a response of the HVS.
T = 54.09288 · G^{0.41850}.  (7)
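A small sketch of Eqs. (6)-(7); the function name is hypothetical, and taking the absolute value of the log-ratio is an assumption here so that the fractional power stays real:

```python
import numpy as np

def contrast_response(L_i, L_j):
    """Log-ratio physical contrast between two neighboring luminances
    (Eq. 6), mapped to an HVS response by the transducer of Eq. (7)."""
    G = np.abs(np.log10(L_i / L_j))   # assumption: magnitude of contrast
    return 54.09288 * G ** 0.41850
```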
3.2.3 Contrast masking
The contrast sensitivity function (CSF) describes the sensitivity of the eye as a function of spatial frequency and adaptation luminance, and can be computed as:
CSF(ρ) = p_4 s_A(l) MTF(ρ) [(1 + (p_1 ρ)^{p_2}) · √(1 − e^{−(ρ/7)^2})]^{−p_3},  (8)
where ρ is the spatial frequency in cycles per degree, p1, p2, p3 and p4 are the fitted parameters,
and sA is the joint photoreceptor luminance sensitivity [30].
The signal-dependent noise N_nCSF can be found from the CSF and is computed as:

N_nCSF = 1 / nCSF[f, o] = MTF(ρ, E_a) s_A(E_a) / CSF(ρ, E_a),  (9)
where Ea is the adapting luminance, and MTF is the modulation transfer function.
BM[f, o] is the activity in the spatial frequency band f and orientation o, which can be
computed as:
B_M[f, o] = |B_R[f, o]| · nCSF[f, o],  (10)

where B_R[f, o] is the f-th spatial frequency band at the o-th orientation of the steerable pyramid.
Contrast masking denotes the visibility reduction of one visual feature in the presence of another. Three components are used to model the contrast masking: the first component is responsible for self-masking, the second for masking across orientations, and the third for masking of adjacent frequency bands.
N_mask[f, o] = k_self (n_f B_M[f, o])^q + k_xo (n_f Σ_{i∈O∖{o}} B_M[f, i])^q + k_xn (n_{f+1} B_M[f+1, o] + n_{f−1} B_M[f−1, o])^q,  (11)

where k_self, k_xo and k_xn are the weights, and n_f = 2^{−(f−1)}.
The HVS model J of the HDR video is computed by summing the luminance perception, image contrast and contrast masking:

J = L_a + T + N_mask.  (12)
3.3 Watermark embedding
In this section, the process of watermark embedding is presented as illustrated in Fig. 1.
Step.1 Key frames are extracted by using scene change detection, and each key frame can be regarded as a third-order tensor A of size M × N × 3. A is divided into non-overlapping blocks of size 4 × 4 × 3, and each block is denoted as K_s, where s is the index of the block.
Fig. 1 Watermark embedding
Step.2 Perform T-QR on each block.
K_s = Q_s R_s (P_s)^{−1}.  (13)
Step.3 The HVS model J is computed by Eq. (12) with the size of M × N, and J is divided into non-overlapping blocks with the size of 4 × 4, denoted as F_s.
Step.4 The watermark W is embedded as:

Q_2^{ws}(2, 1) = avg + ∂/2, Q_2^{ws}(3, 1) = avg − ∂/2, if W(i, j) = 1,  (14)
Q_2^{ws}(2, 1) = avg − ∂/2, Q_2^{ws}(3, 1) = avg + ∂/2, if W(i, j) = 0,  (15)
∂ = ∂max if z(i, j) ≥ v, ∂ = ∂min if z(i, j) < v,  (16)

where z = sum(F_s)/16, v = sum(z)/(M/4 × N/4), avg = (Q_2^s(2, 1) + Q_2^s(3, 1))/2, Q_2^s is the second matrix of Q_s, z(i, j) is the value of z at location (i, j), W(i, j) is the watermark bit, ∂min and ∂max are the minimum and maximum embedding strengths, respectively, and sum(·) returns the sum of the matrix. The modified orthogonal tensor Q_s^w is obtained.
Step.5 Perform inverse T-QR on each block:

K_s^w = Q_s^w R_s (P_s)^{−1}.  (17)

Step.6 Repeat steps 1 to 5 until all key frames of the HDR video are embedded.
For a grey-level video, a group of frames can be considered a third-order tensor, which preserves the temporal correlation of the video. In order to protect the copyright of a grey-level video, the group of frames is divided into non-overlapping blocks, and each block is decomposed by T-QR to extract Q_2^s for embedding the watermark.
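The per-block embedding rule of Eqs. (14)-(16) can be sketched as follows; the helper names are hypothetical, and NumPy's 0-based indexing maps the paper's entries (2,1) and (3,1) to [1,0] and [2,0]:

```python
import numpy as np

def pick_delta(z_ij, v, d_min, d_max):
    """Eq. (16): use the larger strength where the HVS model allows it."""
    return d_max if z_ij >= v else d_min

def embed_bit(Q2, w, delta):
    """Eqs. (14)-(15): re-order the two target entries of the second
    matrix Q2 around their mean, with a gap of delta, to encode bit w."""
    Q2 = Q2.copy()
    avg = (Q2[1, 0] + Q2[2, 0]) / 2
    if w == 1:
        Q2[1, 0], Q2[2, 0] = avg + delta / 2, avg - delta / 2
    else:
        Q2[1, 0], Q2[2, 0] = avg - delta / 2, avg + delta / 2
    return Q2
```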
3.4 Watermark extraction
Watermark extraction is the reverse process of watermark embedding as illustrated in Fig. 2.
Step.1 Each watermarked key frame is regarded as a third-order tensor A*, which is divided into non-overlapping blocks of size 4 × 4 × 3, and each block is denoted as K_s*.
Step.2 Perform T-QR on each block:

K_s* = Q_s* R_s* (P_s*)^{−1},  (18)

where Q_2^{*s} is the second matrix of Q^{*s}.
Step.3 The watermark W* is extracted as:

W*(i, j) = 1 if Q_2^{*s}(2, 1) ≥ Q_2^{*s}(3, 1);  W*(i, j) = 0 if Q_2^{*s}(2, 1) < Q_2^{*s}(3, 1).  (19)
Step.4 Repeat steps 1 to 3 until the watermark is extracted from all key frames.
Fig. 2 Watermark extraction
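The extraction rule of Eq. (19) mirrors the embedding rule; a minimal sketch (same 0-based index convention as above):

```python
import numpy as np

def extract_bit(Q2_star):
    """Eq. (19): the bit is read from the ordering of the two entries
    modified at embedding time ((2,1) and (3,1) in 1-based notation)."""
    return 1 if Q2_star[1, 0] >= Q2_star[2, 0] else 0
```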
4 Experimental results and discussion
In order to demonstrate the effectiveness of the proposed HDR video watermarking method, the Tibul, Sunrise, Playground and ChristmasTree HDR videos are used to evaluate its performance, as illustrated in Fig. 3. Sixteen types of Tone Mapping (TM) attacks are selected from the HDR Toolbox, as shown in Table 1.
The HDR-VDP-2 metric [30] is used to evaluate the quality of the watermarked HDR video, in which V is the imperceptibility index, ranging from 0 to 100. HDR-VDP75% and HDR-VDP95% are the probabilities of detection in at least 75% and 95% of the images, respectively. A high V denotes high visual quality, whereas high HDR-VDP75% and HDR-VDP95% denote low visual quality of the watermarked HDR video. BER is used to evaluate the correctness of watermark extraction, which expresses robustness:
BER = N_w / N_t,  (20)

where N_w and N_t are the number of erroneous watermark bits and the total number of watermark bits, respectively.
4.1 Discussion of Q1, Q2 and Q3
In order to select the most suitable matrix for embedding, the watermark is embedded into Q_1, Q_2 and Q_3 in the same way as in Section 3.3, and the three variants are named Proposed-Q1, Proposed-Q2 and Proposed-Q3, respectively. The watermark is embedded into all four HDR videos, which are then attacked by TM1, TM2, TM4, TM6 and TM13, respectively. The average BERs of the three methods are compared in Table 2: the BERs of Proposed-Q2 are lower than those of Proposed-Q1 and Proposed-Q3,
Fig. 3 HDR video sequences: (a) Tibul, (b) Sunrise, (c) Playground, (d) ChristmasTree
Table 1 16 types of TM attacks

TMs   Name                    TMs   Name
TM1   ChiuTMO                 TM2   AshikhminTMO
TM3   BanterleTMO             TM4   DragoTMO
TM5   ExponentialTMO          TM6   ReinhardTMO
TM7   KimKautzConsistentTMO   TM8   LischinskiTMO
TM9   LogarithmicTMO          TM10  MertensTMO
TM11  NormalizeTMO            TM12  PattanaikTMO
TM13  SchlickTMO              TM14  VanHaterenTMO
TM15  KuangTMO                TM16  WardHistAdjTMO
which denotes that Proposed-Q2 is more robust than Proposed-Q1 and Proposed-Q3. Thus, Q_2 is the most suitable matrix for embedding the watermark.
4.2 Discussion of the embedding strength
∂min and ∂max govern the trade-off between the invisibility and robustness of the proposed method. In the experiment, ∂min and ∂max are set to different values, with ∂max greater than ∂min. Watermarked HDR videos are obtained by the proposed method under the different embedding strengths, and the invisibility and the robustness against different TM attacks are computed. In order to obtain the optimal ∂min and ∂max for each HDR video, Y is calculated under different embedding strengths [38]:

Y = q + (1/5) Σ_{c=1}^{5} (1 − BER_c) × 100,  (21)
where c indexes the five TM attacks TM2, TM10, TM12, TM15 and TM16, q is the imperceptibility index of the watermarked HDR video under the given ∂min and ∂max, and BER_c is the bit error rate of extraction under the c-th TM attack. A large Y means that the corresponding ∂min and ∂max are the most suitable for embedding the watermark.
Taking Tibul as an example, Y reaches its maximum when ∂min = 0.01 and ∂max = 0.03. Similarly, ∂min and ∂max of the other HDR videos are obtained, as shown in Table 3.
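Eq. (21) can be sketched as follows (`strength_score` is a hypothetical name; q is the HDR-VDP-2 imperceptibility index V, and `bers` holds the five TM-attack bit error rates):

```python
def strength_score(q, bers):
    """Joint invisibility/robustness score of Eq. (21), used to pick
    the best (d_min, d_max) pair for one HDR video."""
    return q + sum((1.0 - b) * 100.0 for b in bers) / len(bers)
```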
4.3 Invisibility and robustness
Table 4 shows the values of HDR-VDP75%, HDR-VDP95% and V; the averages of HDR-VDP75%, HDR-VDP95% and V are 6.337%, 3.454% and 76.268, respectively, which denotes that the watermark in the HDR video cannot be observed by human vision, as illustrated in
Table 2 Average BER of all HDR videos

Attacks  Proposed-Q1  Proposed-Q2  Proposed-Q3
TM1      0.0606       0.0177       0.0256
TM2      0.0412       0.0044       0.0383
TM4      0.0535       0.0048       0.0368
TM6      0.0636       0.0035       0.0234
TM13     0.1886       0.0122       0.0568
Table 3 The embedding strength of HDR videos

HDR video      ∂min  ∂max
Tibul          0.01  0.03
Sunrise        0.02  0.04
Playground     0.04  0.06
ChristmasTree  0.01  0.03
Table 4 Invisibility of the HDR videos

HDR videos     HDR-VDP75% (%)  HDR-VDP95% (%)  V
Tibul          3.170           1.450           76.987
Sunrise        22.160          12.360          74.444
Playground     0.012           0.004           72.929
ChristmasTree  0.006           0.003           80.713
Average        6.337           3.454           76.268
Fig. 4. Figure 5 shows that nearly 100% of the watermark can be extracted from the different watermarked HDR videos when they are not under any attack.
In order to demonstrate the robustness of the proposed HDR video watermarking method against TM attacks, TM attacks are applied to Playground and ChristmasTree, respectively, as shown in Table 5. Figure 6 shows the attacked Playground under part of the TM attacks. From Table 5, the average BERs of Playground and ChristmasTree are 0.0095 and 0.0089, respectively, which denotes that the proposed method can resist TM attacks efficiently.
Fig. 4 Watermarked HDR videos: (a) Tibul, (b) Sunrise, (c) Playground, (d) ChristmasTree
Fig. 5 The extracted watermark images: (a) Tibul (BER = 0.0006), (b) Sunrise (BER = 0.0002), (c) Playground (BER = 0.0007), (d) ChristmasTree (BER = 0.0006)
Table 5 BER under 16 types of TM attacks

TM attacks  Playground  ChristmasTree    TM attacks  Playground  ChristmasTree
TM1         0.0131      0.0073           TM2         0.0022      0.0044
TM3         0.0047      0.0055           TM4         0.0035      0.0050
TM5         0.0065      0.0089           TM6         0.0012      0.0022
TM7         0.0241      0.0263           TM8         0.0009      0.0008
TM9         0.0013      0.0012           TM10        0.0301      0.0134
TM11        0.0008      0.0006           TM12        0.0207      0.0248
TM13        0.0039      0.0050           TM14        0.0012      0.0011
TM15        0.0570      0.0380           TM16        0.0638      0.0347
Fig. 6 Attacked Sunrise by using TM attacks: (a) TM1, (b) TM2, (c) TM5, (e) TM6, (f) TM7, (g) TM8
In order to show robustness against video attacks and hybrid attacks, a variety of such attacks are applied to Tibul and Sunrise, as shown in Table 6, including frame average and H.265 (encoder_intra_main10, GOP = 1, QP = 22). Frame averaging computes the average of multiple frames, which can destroy the watermark; however, the proposed method shows strong robustness, with corresponding BERs of 0.0020 and 0.0505, respectively. Since the watermark is embedded into each key frame of the video, frame swapping and frame dropping do not destroy it, and nearly 100% of the watermark can be extracted. Most BERs of Tibul and Sunrise are lower than 0.04, and the average BERs of Tibul and Sunrise are 0.0167 and 0.0246, respectively, which denotes that the proposed method has strong robustness as well. Figure 7 shows watermark extraction from Tibul; the extracted watermarks remain recognizable.
Table 6 BER of video attacks

Attacks                       Tibul   Sunrise
Frame average                 0.0020  0.0505
Frame swapping (30%)          0.0006  0.0002
Frame dropping (30%)          0.0006  0.0002
Frame dropping (50%)          0.0006  0.0002
H.265                         0.0609  0.0282
Salt & Pepper (0.001) + TM1   0.0281  0.0275
Poisson + TM2                 0.0090  0.0652
Gaussian filter (3×3) + TM6   0.0124  0.0230
Sharpen (0.5) + TM8           0.0359  0.0268
Average                       0.0167  0.0246
Fig. 7 Watermark extractions from Tibul: (a) Frame average, (b) Frame swapping (30%), (c) Frame dropping (30%), (d) Frame dropping (50%), (e) H.265, (f) Gaussian filter (3×3) + TM6, (g) Salt & Pepper (0.001) + TM1, (h) Poisson + TM2, (i) Sharpen (0.5) + TM8
4.4 Comparisons
In order to demonstrate the effectiveness of the HVS model, the proposed method without the guidance of the HVS model, namely Proposed-WH, is tested, as shown in Table 7, taking Tibul and ChristmasTree as examples. From Table 7, the V of the proposed method is similar to that of Proposed-WH, but the BERs of the proposed method are clearly better, which denotes that the proposed method is more robust than Proposed-WH. Under the guidance of the HVS model, the HDR video watermarking method obtains more robustness.
Furthermore, to demonstrate the effectiveness of the proposed method, it is compared with Bakhsh's [5], Kang's [17] and Joshi's [16] methods, as shown in Table 8, when Sunrise is under different attacks. From Table 8, the imperceptibility of the proposed method is clearly higher than those of Bakhsh's [5], Kang's [17] and Joshi's [16], and most BERs of the proposed method are lower than theirs, especially for TM1 and TM11. For example, for TM4, the BER of the proposed method is nearly 0.09, 0.06 and 0.3 lower than those of Bakhsh's [5], Kang's [17] and Joshi's [16], respectively. For TM9, the BER of the proposed method is nearly 0.01, 0.1 and 0.3 lower than those of
Table 7 Comparisons with Proposed-WH

                Proposed  Proposed-WH
Tibul
  V             76.987    77.240
  TM1           0.0269    0.0454
  TM2           0.0075    0.0149
  TM3           0.0090    0.0166
  TM13          0.0050    0.0079
  TM14          0.0016    0.0021
ChristmasTree
  V             80.550    80.713
  TM1           0.0073    0.0117
  TM2           0.0044    0.0487
  TM3           0.0055    0.0401
  TM10          0.0134    0.0572
  TM13          0.0050    0.0263
Table 8 Comparisons on different attacks

Attacks                       Proposed  Bakhsh's [5]  Kang's [17]  Joshi's [16]
V                             74.444    67.3772       70.0520      72.0132
TM1                           0.0234    0.0607        0.1445       0.3956
TM4                           0.0084    0.0915        0.0635       0.3849
TM5                           0.0339    0.0512        0.1582       0.3861
TM7                           0.0239    0.0173        0.0791       0.3849
TM9                           0.0401    0.0594        0.1191       0.3235
TM10                          0.0631    0.0750        0.1396       0.3440
TM11                          0.0011    0.0495        0.0137       0.3485
TM13                          0.0350    0.0552        0.1152       0.4199
Gaussian filter (3×3) + TM6   0.0230    0.0897        0.1250       0.3944
Sharpen (0.5) + TM8           0.0268    0.0348        0.1104       0.3851
Poisson + TM2                 0.0652    0.0556        0.2148       0.4135
Average                       0.0313    0.0582        0.1166       0.3800
Table 9 Average BER of the HDR video database [3]

Attacks  Proposed  Bakhsh's [5]  Kang's [17]  Joshi's [16]
TM3      0.0389    0.0784        0.1025       0.0987
TM8      0.0246    0.0981        0.2085       0.1189
TM11     0.0358    0.1011        0.0977       0.1278
TM13     0.0486    0.1023        0.2145       0.2793
Bakhsh's [5], Kang's [17] and Joshi's [16], respectively. Compared with Bakhsh's [5],
although the BERs of the proposed method are higher for TM7 and Poisson + TM2, they are
lower for the other attacks, such as TM1, TM5 and Gaussian filter (3 × 3) + TM6. Compared
with Kang's [17] and Joshi's [16], the proposed method is obviously better. Over all attacks,
the proposed method performs best, since its average BER is nearly 0.03, 0.09 and 0.35 lower
than those of Bakhsh's [5], Kang's [17] and Joshi's [16], respectively. Above all, the proposed
method has a strong capability of protecting the copyright of the HDR video.
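The BER values reported in Tables 8 and 9 measure the fraction of watermark bits that are flipped between embedding and extraction after an attack. A minimal sketch of this computation is given below (the function name and bit-array representation are ours, not taken from the paper):

```python
import numpy as np

def bit_error_rate(original_bits, extracted_bits):
    """Fraction of watermark bits that differ after attack and extraction."""
    original_bits = np.asarray(original_bits, dtype=np.uint8)
    extracted_bits = np.asarray(extracted_bits, dtype=np.uint8)
    assert original_bits.shape == extracted_bits.shape
    return float(np.mean(original_bits != extracted_bits))

# Example: a 32-bit watermark with two flipped bits -> BER = 2/32 = 0.0625
w = np.random.randint(0, 2, 32)
w_attacked = w.copy()
w_attacked[[3, 17]] ^= 1  # simulate two bit errors caused by an attack
print(bit_error_rate(w, w_attacked))  # 0.0625
```

A BER of 0 means the watermark survived the attack perfectly; the averages in Table 8 are taken over all attacked frames of a video.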
4.5 Robustness on other HDR videos
An HDR video database [3] including 10 HDR videos is used to further demonstrate the
robustness of the proposed method. As shown in Table 9, the BERs of the proposed method
are obviously lower than those of Bakhsh's [5], Kang's [17] and Joshi's [16] methods, which
proves the effectiveness of the proposed method again.
5 Conclusion
In this paper, a robust HDR video watermarking method based on the HVS model and T-QR is
proposed. Key frames are extracted by using scene change detection, and each key frame is
regarded as a third-order tensor so that a robust embedding domain can be obtained by using T-QR.
After T-QR decomposition, the orthogonal tensor is calculated, which consists of the first, second and third matrices.
Compared with the other two matrices, the second matrix is more robust, and therefore it is more
suitable for embedding the watermark. The HVS model is computed by using luminance perception,
image contrast and contrast masking, which determines the embedding strength. Experimental results
show that the proposed method can resist various attacks and effectively protect the copyright of
HDR videos. In future work, we will further explore visual perception factors of the HDR video
to guide watermark embedding for improving the watermarking efficiency.
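The T-QR decomposition at the core of the method can be sketched with a Fourier-domain implementation of the t-product, in the spirit of the tensor framework of Kilmer et al. [13, 31]: the third-order tensor is transformed by an FFT along its third mode, a matrix QR is applied to every frontal slice, and the factors are transformed back. This is only an illustrative sketch with our own function names; the paper's block partitioning, bit embedding in the second slice of the orthogonal tensor, and the HVS-based strength computation are omitted.

```python
import numpy as np

def t_qr(A):
    """T-QR of a tensor A with square frontal slices (n x n x n3):
    FFT along the third mode, slice-wise matrix QR in the Fourier
    domain, then inverse FFT, so that A = Q * R under the t-product."""
    Af = np.fft.fft(A, axis=2)
    Qf = np.empty_like(Af)
    Rf = np.empty_like(Af)
    for k in range(A.shape[2]):
        Qf[:, :, k], Rf[:, :, k] = np.linalg.qr(Af[:, :, k])
    # Conjugate symmetry of the FFT slices makes the inverse transforms real.
    return np.real(np.fft.ifft(Qf, axis=2)), np.real(np.fft.ifft(Rf, axis=2))

def t_product(A, B):
    """t-product A * B: slice-wise matrix products in the Fourier domain."""
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    Cf = np.einsum('ijk,jlk->ilk', Af, Bf)  # per-slice matrix multiply
    return np.real(np.fft.ifft(Cf, axis=2))

# Sanity check on a small block, e.g. a 4x4 block of a 3-channel key frame
A = np.random.rand(4, 4, 3)
Q, R = t_qr(A)
print(np.allclose(t_product(Q, R), A))  # True: Q * R reconstructs A
```

In the watermarking pipeline, a frontal slice of the orthogonal tensor Q (the "second matrix" in the paper's terminology) would then be modified according to the watermark bits and the HVS-derived strength before recomposing the block.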
Acknowledgments This work was supported by the Natural Science Foundation of China under Grant Nos.
61971247 and 61501270, the Zhejiang Provincial Natural Science Foundation of China under Grant Nos.
LY22F020020 and LQ20F010002, and the Natural Science Foundation of Ningbo under Grant No. 2021J134. It was
also sponsored by the K. C. Wong Magna Fund in Ningbo University.
Declarations
Conflict of interest We declare that we have no financial or personal relationships with other people or
organizations that could inappropriately influence our work, and that there is no professional or other personal
interest of any nature or kind in any product, service and/or company that could be construed as influencing the
position presented in, or the review of, the manuscript entitled "Robust HDR video watermarking method based
on the HVS model and T-QR".
References
1. Abdulla AA (2015) Exploiting similarities between secret and cover images for improved embedding
efficiency and security in digital steganography. http://bear.buckingham.ac.uk/149/
2. Alias S, Ramakrishnan SP (2018) Fibonacci based key frame selection and scrambling for video
watermarking in DWT–SVD domain. Wirel Pers Commun 102:2011–2031. https://doi.org/10.1007/s11277-018-5252-1
3. Azimi M, Banitalebi-Dehkordi A, Dong Y, Nasiopoulos P (2018) Evaluating the performance of existing
full-reference quality metrics on high dynamic range (HDR) video content. ICMSP 2014: XII International
Conference on Multimedia Signal Processing
4. Bai Y, Jiang G, Yu M, Peng Z, Chen F (2018) Towards a tone mapping robust watermarking algorithm for
high dynamic range image based on spatial activity. Signal Process Image Commun 65:187–200. https://doi.org/10.1016/j.image.2018.04.005
5. Bakhsh FY, Moghaddam ME (2018) A robust HDR images watermarking method using artificial bee
colony algorithm. J Inf Secur Appl 41:12–27. https://doi.org/10.1016/j.jisa.2018.05.003
6. Balasamy K, Suganyadevi S (2021) A fuzzy based ROI selection for encryption and watermarking in
medical image using DWT and SVD. Multimed Tools Appl 80:7167–7186. https://doi.org/10.1007/s11042-020-09981-5
7. Chen Y, Jia ZG, Peng Y, Peng YX, Zhang D (2021) A new structure-preserving quaternion QR decomposition
method for color image blind watermarking. Signal Process 185:108088. https://doi.org/10.1016/j.sigpro.2021.108088
8. Cheng YM, Wang CM (2009) A novel approach to steganography in high-dynamic-range images. IEEE
Multimed 16:70–80 https://doi.ieeecomputersociety.org/10.1109/MMUL.2009.43
9. Chi B, Yu M, Jiang GY, He ZY, Chen F (2020) Blind tone mapped image quality assessment with image
segmentation and visual perception. J Vis Commun Image Represent 67:102752. https://doi.org/10.1016/j.jvcir.2020.102752
10. Choudhury A (2020) Robust HDR image quality assessment using combination of quality metrics.
Multimed Tools Appl 79:22843–22867. https://doi.org/10.1007/s11042-020-08985-5
11. Ernawan F, Kabir MN (2020) A block-based RDWT-SVD image watermarking method using human
visual system characteristics. Vis Comput 36:19–37. https://doi.org/10.1007/s00371-018-1567-x
12. Guerrini F, Okuda M, Adami N, Leonardi R (2011) High dynamic range image watermarking robust against
tone-mapping operators. IEEE Trans Inf Forensics Secur 6:283–295. https://doi.org/10.1109/TIFS.2011.2109383
13. Hao N, Kilmer ME, Braman K, Hoover RC (2013) Facial recognition using tensor-tensor decompositions.
SIAM J Imaging Sci 6:437–463. https://doi.org/10.1137/110842570
14. Hu R, Xiang S (2020) Cover-lossless robust image watermarking against geometric deformations. IEEE
Trans Image Process 30:318–331. https://doi.org/10.1109/TIP.2020.3036727
15. Hu J, Shao Y, Ma W, Zhang T (2015) A robust watermarking scheme based on the human visual system in
the wavelet domain. In: 2015 8th international congress on image and signal processing (CISP), pp 799–
803. https://doi.org/10.1109/CISP.2015.7407986
16. Joshi A, Gupta S, Girdhar M, Agarwal P, Sarker R (2017) Combined DWT–DCT-based video
watermarking algorithm using Arnold transform technique. In: Proceedings of the international conference
on data engineering and communication technology, vol 468, pp 455–463. https://doi.org/10.1007/978-981-10-1675-2_45
17. Kang X, Zhao F, Lin G, Chen Y (2018) A novel hybrid of DCT and SVD in DWT domain for robust and
invisible blind image watermarking with optimal embedding strength. Multimed Tools Appl 77:13197–
13224. https://doi.org/10.1007/s11042-017-4941-1
18. Kang J, Hou JU, Ji SK, Lee H (2020) Robust spherical panorama image watermarking against viewpoint
desynchronization. IEEE Access 8:2169–3536. https://doi.org/10.1109/ACCESS.2020.3006980
19. Khan IR, Huang Z, Farbiz F, Manders CM (2009) HDR image tone mapping using histogram adjustment
adapted to human visual system. In: 2009 7th international conference on information, communications and
signal processing, pp 1–5. https://doi.org/10.1109/ICICS.2009.5397652
20. Khare P, Srivastava V (2020) HT-IWT-DCT-based hybrid technique of robust image watermarking.
Advances in VLSI, communication, and signal processing 683, pp 359–370. https://doi.org/10.1007/978-981-15-6840-4_28
21. Khwildi R, Zaid AO (2020) HDR image retrieval by using color-based descriptor and tone mapping
operator. Vis Comput 36:1111–1126. https://doi.org/10.1007/s00371-019-01719-1
22. Khwildi R, Zaid AO, Dufaux F (2021) Query-by-example HDR image retrieval based on CNN. Multimed
Tools Appl 80:115413–115428. https://doi.org/10.1007/s11042-020-10416-4
23. Lai C (2011) An improved SVD-based watermarking scheme using human visual characteristics. Opt
Commun 284:938–944. https://doi.org/10.1016/j.optcom.2010.10.047
24. Li J, Zhang C (2020) Blind and robust watermarking scheme combining bimodal distribution structure with
iterative selection method. Multimed Tools Appl 79:1373–1407. https://doi.org/10.1007/s11042-019-08213-9
25. Li MT, Huang NC, Wang CM (2011) A data hiding scheme for high dynamic range images. Int J Innov
Comput Inf Control 7:2021–2035
26. Lin YT, Wang CM, Chen WS, Lin FP (2017) A novel data hiding algorithm for high dynamic range
images. IEEE Trans Multimed 19:196–211. https://doi.org/10.1109/TMM.2016.2605499
27. Luo Y, Peng D (2021) A robust digital watermarking method for depth-image-based rendering 3D video.
Multimed Tools Appl 80:14915–14939. https://doi.org/10.1007/s11042-020-10375-w
28. Maiorana E, Campisi P (2016) High-capacity watermarking of high dynamic range images. EURASIP J
Image Video Process:1–15. https://doi.org/10.1186/s13640-015-0100-7
29. Mantiuk R, Myszkowski K, Seidel HP (2006) A perceptual framework for contrast processing of high
dynamic range images. ACM Trans Appl Percept 3:286–308. https://doi.org/10.1145/1166087.1166095
30. Mantiuk R, Kim KJ, Rempel AG, Heidrich W (2011) HDR-VDP-2: a calibrated visual metric for visibility
and quality predictions in all luminance conditions. ACM Trans Graph 30:1–14. https://doi.org/10.1145/2010324.1964935
31. Martin CD, Shafer R, LaRue B (2013) An order-p tensor factorization with applications in imaging. SIAM J
Sci Comput 35:A474–A490. https://doi.org/10.1137/110841229
32. Mousavi S, Naghsh A (2021) A robust Plenoptic image watermarking method using graph-based transform.
Multimed Tools Appl 80:14591–14608. https://doi.org/10.1007/s11042-021-10555-2
33. Nouioua I, Amardjia N, Belilita S (2018) A novel blind and robust video watermarking technique in fast
motion frames based on SVD and MR-SVD. Secur Commun Netw. https://doi.org/10.1155/2018/6712065
34. Sang J, Liu Q, Song CL (2020) Robust video watermarking using a hybrid DCT-DWT approach. J Electron
Sci Technol 100052. https://doi.org/10.1016/j.jnlest.2020.100052
35. Solachidis V, Maiorana E, Campisi P (2013) HDR image multi-bit watermarking using bilateral-filtering-based
masking. Image Processing: Algorithms and Systems XI 8655, p 865505. https://doi.org/10.1117/12.2005240
36. Wu J (2012) Robust watermarking framework for high dynamic range images against tone mapping attacks.
In: Watermarking
37. Yu CM, Wu KC, Wang CM (2011) A distortion-free data hiding scheme for high dynamic range images.
Displays. 32:225–236. https://doi.org/10.1016/j.displa.2011.02.004
38. Yu M, Wang Y, Jiang GY, Bai Y, Luo T (2019) High dynamic range image watermarking based on Tucker
decomposition. IEEE Access 7:113053–113064. https://doi.org/10.1109/ACCESS.2019.2935627
39. Yuan Z, Liu D, Zhang X, Wang H, Su Q (2020) DCT-based color digital image blind watermarking method
with variable steps. Multimed Tools Appl 79:30557–30581. https://doi.org/10.1007/s11042-020-09499-w
40. Zhang Y, Sun YY (2019) An image watermarking method based on visual saliency and contourlet
transform. Optik 186:379–389. https://doi.org/10.1016/j.ijleo.2019.04.091
41. Zhou RG, Hu W, Fan P, Luo G (2018) Quantum color image watermarking based on Arnold transformation
and LSB steganography. Int J Quantum Inf 16. https://doi.org/10.1142/S0219749918500211
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps
and institutional affiliations.
MENG DU is currently pursuing the master's degree with the Faculty of Information Science and Engineering,
Ningbo University. Her research interests include multimedia security and image processing.
TING LUO received the Ph.D. degree from the Faculty of Information Science and Engineering, Ningbo
University, in 2016. He is currently a Professor with the College of Science and Technology, Ningbo University.
His research interests include multimedia security, image processing, data hiding, and pattern recognition.
HAIYONG XU is currently pursuing the Ph.D. degree with the Faculty of Information Science and Engineering,
Ningbo University. He is currently a Teacher with the College of Science and Technology, Ningbo University.
His research interests include multimedia communication, image processing, and machine learning.
YANG SONG received the M.S. and Ph.D. degrees from Ningbo University in 2015 and 2018, respectively. He is
currently a teacher with the College of Science and Technology, Ningbo University. His research interests
include multimedia communication, image processing, and visual quality assessment.
Chunpeng Wang received the B.E. degree in computer science and technology from Shandong Jiaotong
University, China, in 2010, the M.S. degree from the School of Computer and Information Technology, Liaoning
Normal University, China, in 2013, and the Ph.D. degree from the Faculty of Electronic Information and
Electrical Engineering, Dalian University of Technology, China, in 2017. He is currently a teacher with the
School of Information, Qilu University of Technology (Shandong Academy of Sciences), China. His research
interests mainly include image watermarking and signal processing.
Li Li received the B.S. and M.S. degrees in mathematics from Zhejiang University in 1994 and 1997,
respectively, and the Ph.D. degree in computer science from the same university in 2004. She is currently with
Hangzhou Dianzi University, where she was a professor from 2002 to 2005. Her research interests include
digital image watermarking and computer animation; her current research interests include image/video/3D
mesh watermarking, QR code, and image/video processing.