Video Content-Based Bit Rate Estimation Scheme for Transcoding in IPTV Services
KSII Transactions on Internet and Information Systems (TIIS). 2014. Mar, 8(3): 1040-1057
Copyright © 2014, Korean Society For Internet Information
  • Received : October 31, 2013
  • Accepted : January 04, 2014
  • Published : March 28, 2014
About the Authors
Hye Jeong Cho
AV Research and Development Laboratory, ARION Technology Inc., Gyeonggi-Do, Korea
Chae-Bong Sohn
Department of Electronics and Communications Engineering, Kwangwoon University, Seoul, Korea
Seoung-Jun Oh
Department of Electronic Engineering, Kwangwoon University, Seoul, Korea

Abstract
In this paper, a new bit rate estimation scheme is proposed to determine the bit rate for each subclass in an MPEG-2 TS to H.264/AVC transcoder after dividing an input MPEG-2 TS sequence into several subclasses. Video format transcoding in conventional IPTV and Smart TV services is a time-consuming process, since the input sequence must be fully transcoded several times at different bit rates to decide the bit rate suitable for a service. The proposed scheme automatically decides the bit rate of the transcoded video sequence so that the sequence stored on a video streaming server is as small as possible without any loss of subjective quality. In the proposed scheme, an input sequence to the transcoder is sub-classified by hierarchical clustering using a parameter value extracted from each frame. The candidate frames of each subclass are then used to estimate the bit rate via statistical analysis and a mathematical model. Experimental results show that the proposed scheme reduces the bit rate, on average, by approximately 52% in low-complexity video and 6% in high-complexity video with negligible degradation in subjective quality.
1. Introduction
Broadcasting and communications convergence services use limited networks to deliver IPTV, Smart TV, and other Internet services to consumers. The compression technology of serviced video is applied according to two business models: Managed Network and Open Internet. IPTV service providers deliver H.264/AVC video content through the Managed Network. In the Open Internet model, the service is delivered over the public Internet and should enable access to video content not only from TV sets but also from other home devices, such as portable multimedia players and laptop computers. Scalable video coding (SVC) technology enables the system to consider the available bandwidth of these devices. A full SVC implementation, however, also incurs some increase in complexity and bit rate for the same fidelity as compared with single-layer coding [1]. A further study is needed on how best to control the SVC rate according to network resource availability [2]. Most IPTV services focus on delivering high-resolution, high-quality video over the Managed Network while supporting quality of service (QoS).
The MPEG-2 standard has been widely deployed in video distribution infrastructures, such as cable and satellite networks, as well as in several consumer applications, such as DVDs and DVRs. The H.264/AVC standard is used in many video streaming services limited by the network bandwidth and offers a significant reduction in the bit rate over earlier standards-based technologies such as MPEG-2 (65%) and MPEG-4 (40-50%) [3] [4] . The standard achieves better performance in terms of both the peak signal to noise ratio (PSNR) and visual quality at the same bit rate as compared with prior video coding standards.
In video streaming services with IPTV and Smart TV, a video transcoder is necessary to leverage the compression efficiency offered by H.264/AVC with broadcast quality content produced in the MPEG-2 format. To service video content over the Managed Network for users, Fig. 1 shows the process used for video content transmission.
Fig. 1. Video content transcoding process in an IPTV service.
In the transcoder, the input video is decoded by MPEG-2 and re-encoded by H.264/AVC at a fixed bit rate. After validation of the subjective quality, the video content is stored on a video streaming server and then serviced to users with varied, engaging content via a streaming server [5]. The encoded video content is usually delivered through constant bit rate (CBR) channels. The channel bit rates needed for SDTV and HDTV video can be as high as 2–3Mbps and 10–12Mbps, respectively. Because each item of video content on a CBR channel is encoded at one of these two fixed bit rates, the content's characteristics are not taken into account; the serviced video content, however, varies from low-complexity video to high-complexity video. The former can be encoded at a bit rate lower than the fixed bit rate without degradation in subjective quality. In other words, the conventional scheme based on a fixed bit rate wastes bandwidth and requires a huge amount of storage space on a streaming server. When the open IPTV service is activated later, IPTV service providers can deliver content that, unlike specific companies' customized content, is a network resource anyone can access. In order to deliver a considerable amount of content on a CBR channel, it is important to select an efficient bit rate.
Solving this problem requires a scheme capable of finding an appropriate bit rate for video content while maintaining a subjective quality equivalent to that of a scheme that uses a fixed bit rate. Employing this scheme requires determining a bit rate for video content prior to encoding it. A video transcoder can provide an additional controller that can also estimate the bit rate. A simple technique to estimate the video content’s bit rate is to vary the bit rate step in the H.264/AVC encoder part of the transcoder. The visual quality should be verified at each encoding pass. Even though this method can provide an accurate bit rate, it is a very time-consuming process. The time required to estimate the bit rate should be minimized to meet the video streaming service requirements.
In this paper, a scheme is proposed for automatically estimating the bit rate of each subclass without the repeated full encoding and subjective quality test. Using parameters, the video content is divided into several segments. To estimate the bit rate of each segment, candidate frames are extracted, which include intra-frames that require a high number of bits. Finally, the bit rate of each segment is estimated by statistical analysis and a mathematical model based on a given target quality. The remainder of this paper is organized as follows. Section II explains the analysis of video content with respect to the quality and bit rate. Section III proposes a bit rate estimation scheme for unsupervised segmentation using the frame complexity of video content. Then, the experimental results and conclusions are presented in Sections IV and V, respectively.
2. Analysis of the Quality and Bit Rate of Video Content
The purpose of this analysis is to examine the human-perceived quality corresponding to the bit rates of a video. The subjective quality of H.264/AVC encoded video is evaluated, in which a low-complexity content category such as “lecture” is coded at bit rates from 1.0 to 2.5Mbps. The evaluation is performed using the double-stimulus continuous quality scale (DSCQS) method of ITU-R Rec. BT.500-7 [6]. All the coded stimuli are rated by each of five viewers, and general conclusions are based on the quality ratings of the presented stimuli. The main idea of the DSCQS score is to determine the differential mean opinion score (DMOS) between the reference encoded at 2.5Mbps and the test sequences, averaged over all the viewers. A DMOS value, dMOS, is defined as follows:
dMOS = MOSr − MOSp   (1)
where MOSr is the MOS of the reference sequence encoded at 2.5Mbps, and MOSp is the MOS of the test sequence encoded below 2.5Mbps. The task is to assess the degradation of the test sequence with respect to the reference sequence. If dMOS is near “0”, the test sequence is similar to the reference sequence. Fig. 2 shows the average of all dMOS values for a low-complexity video. On average, the bit rate at which quality degradation became noticeable was 1.4Mbps. Therefore, low-complexity video can be encoded at a bit rate lower than 2.5Mbps with negligible degradation of subjective quality.
Fig. 2. Result of quality evaluation.
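As a quick sketch of how (1) is applied, the following computes a DMOS value from per-viewer scores; the function name and the rating values are illustrative, not the paper's data.

```python
def dmos(ref_scores, test_scores):
    """Differential mean opinion score of Eq. (1):
    mean MOS of the reference minus mean MOS of the test sequence."""
    mos_r = sum(ref_scores) / len(ref_scores)
    mos_p = sum(test_scores) / len(test_scores)
    return mos_r - mos_p

# Five viewers rate the 2.5 Mbps reference and a lower-rate test clip
# (illustrative numbers only).
ref = [82, 78, 85, 80, 79]
test = [80, 77, 84, 79, 78]
print(dmos(ref, test))  # a small dMOS means the test clip is perceptually close
```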
Further, the difference between the variable bit rate (VBR) at QP 22 and the CBR at 2.5Mbps is analyzed for the test sequence. As shown in Fig. 3 , some video content can be encoded at a lower bit rate than at the fixed bit rate. Video content can be divided into two or three subclasses in terms of the quality of experience (QoE). It can also be delivered using more than one bit rate according to subclasses in a CBR channel.
Fig. 3. The differential ratio between VBR and CBR.
3. Proposed Scheme
In this section, a bit rate estimation scheme is proposed that reduces the bit rate while maintaining the target quality in video streaming services limited by the network bandwidth. Fig. 4 shows a block diagram of the proposed scheme. Given an input sequence in MPEG-2 TS form, the TS parser gathers the MPEG-2 video data, which are decompressed by the MPEG-2 decoder. The deinterlacer converts interlaced video frames to progressive frames, since broadcast video is commonly interlaced. Using parameters extracted from the decoded frames, the video is divided into several segments. To estimate the bit rate of each segment, candidate frames are extracted, including the intra-frames that require a large number of bits. Finally, the bit rate of each segment is estimated by statistical analysis and a mathematical model based on the target quality. The input video is re-encoded by H.264/AVC at the estimated bit rate. After validation of the subjective quality, the video content is stored on a video streaming server.
Fig. 4. Block diagram of the proposed scheme.
The proposed scheme differs from the conventional scheme in that it employs a bit rate estimator. Because the proposed scheme does not encode full frames of video content, it is very important to determine parameters that can serve to indirectly measure a frame’s bits.
- 3.1 Frame Complexity Estimation for an Intra-frame
Some content complexity measurements for coding still images can be obtained without pre-encoding by using variance, edge, and gradient methods [7]. The complexity can also be determined from the deviation of each macroblock (MB) [8]. In the gradient-based method, the computational cost of calculating the gradient is low, and the gradient is highly correlated with the output bits of each intra-frame [9]. These properties are highly desirable for measuring the complexity of an intra-frame. In addition to the gradient information, the histograms of the luminance and chrominance pixel values are also very useful when combined with the gradient to represent the content complexity.
Given the arbitrary sth test sequence Q s , the set contains a number of groups of pictures (GOPs) specified in the order in which the intra- and inter-frames are arranged:
Qs = { Qs(i, j) | 1 ≤ i ≤ M, 1 ≤ j ≤ N }   (2)
where M is the total number of GOPs, and N is the number of frames in a GOP. Qs ( i , j ) denotes the j th frame of the i th GOP. Our objective is to measure the intra-frame complexity in Q s . In order to measure the frame complexity, the complexity measurement defined in [10] , FCintra , is used. The value of FCintra for Qs ( i , j ) ∈ Q s , CC( Qs ( i , j )), can be computed by (3).
[Equation (3): CC(Qs(i, j)) defined in terms of Grads,i and SOHs,i]
where
[Equations defining Grads,i (gradient term) and SOHs,i (histogram statistic term)]
In (3), Grads,i and SOHs,i are the gradient and the statistic of the histogram information, respectively, of the ith intra-frame. Ys,i(x, y) is the luminance value of pixel (x, y) in the ith frame, and Us,i(x, y) and Vs,i(x, y) are the corresponding chrominance values. KY×LY, KU×LU, and KV×LV are the sizes of the Y-, U-, and V-frames in Qs(i,1). HYs,i[l] is the histogram of luminance level l, and HUs,i[l] and HVs,i[l] are the histograms of chrominance level l.
To investigate the relationship between the actual number of encoded bits and FCintra , various test sequences were extensively encoded using the intra-coding mode under constant quantization parameters (QPs), and both the number of encoded bits and the FCintra for each frame were recorded. Fig. 5 shows the scatter plots of the number of bits versus FCintra at different QPs in our test content, where each dot represents a frame. Fig. 5 also shows the accuracy of the linear approximations (as blue dotted lines) by plotting the correlation coefficient r , which is an indicator of how closely the approximated linear relationship represents the actual data. The value of r lies between -1 and 1. For the test sequences, the value of r between the number of bits and FCintra is, on an average, 0.93. When the value of r is at or near 1, the approximated linear relationship is the most reliable. Therefore, it is clear that a linear relationship exists in our test sequences with different slopes, and (3) can be used accurately to estimate the number of bits for intra-frames.
Fig. 5. Scatter plots of the number of encoded bits versus FCintra: (a) Documentary, (b) Lecture, (c) Religion, and (d) Sports.
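The bits-versus-FCintra correlation described above can be illustrated with a small sketch. The gradient measure below is a simplified, gradient-only stand-in for FCintra (it omits the histogram term of (3)), and the frames and bit counts are synthetic:

```python
def gradient_complexity(frame):
    """Sum of absolute horizontal and vertical luminance differences:
    a gradient-only stand-in for the FC_intra measure."""
    h, w = len(frame), len(frame[0])
    g = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                g += abs(frame[y][x + 1] - frame[y][x])
            if y + 1 < h:
                g += abs(frame[y + 1][x] - frame[y][x])
    return g

def pearson_r(xs, ys):
    """Correlation coefficient r between two samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return cov / (sx * sy)

# Synthetic 8x8 luminance frames of increasing detail; the encoded bit
# counts are hypothetical values that grow with complexity.
flat = [[16] * 8 for _ in range(8)]
ramp = [[x * 4 for x in range(8)] for _ in range(8)]
busy = [[(x * 37 + y * 61) % 256 for x in range(8)] for y in range(8)]
complexities = [gradient_complexity(f) for f in (flat, ramp, busy)]
bits = [1200, 5400, 21000]  # hypothetical encoded sizes
print(pearson_r(complexities, bits))  # close to 1 for this toy data
```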
- 3.2 Hierarchical Clustering-Based Video Sub-classification
Each of the subclasses—clusters, or groups of patterns of FCintra —has a similar number of bits. The classifier for FCintra is designed by hierarchical clustering with Bayesian decision theory [11] .
Consider a sequence T containing n samples and c clusters. To conduct agglomerative hierarchical clustering of FCintra, the number of initial clusters, n, is determined by analyzing the temporal characteristics between frames. The scale-invariant feature transform (SIFT) is sequentially applied to detect stable frames among temporal frames [12]. Let T(x, y, t) be the ordinal signature of the (x, y)th block of the tth frame in T. Gσ(x, y, t) defines a 3×3×3 Gaussian kernel with standard deviation σ as follows:
Gσ(x, y, t) = (1 / ((2π)^(3/2) σ³)) exp(−(x² + y² + t²) / (2σ²))   (4)
A 3×3×3 difference-of-Gaussian (DoG) kernel [13] is derived by computing the difference between two Gaussian kernels as follows:
DoG(x, y, t) = G(k^s)σ(x, y, t) − G(k^(s−1))σ(x, y, t)   (5)
where k > 1 is a multiplicative factor, and s = 1,2,…, is the scale of the DoG kernel. Then, the DoG kernel sliding over T is used to generate a vector ψ by the convolution operation as follows:
ψ(t) = (DoG ∗ T)(t)   (6)
for t = 1,…, m. If the tth element in ψ is a local extremum, it is considered to be a key frame in T. In this paper, the parameters are set to σ = 1.8 and s = 3. A sequence consists of the static subclass ω0 and the dynamic subclass ω1, divided by the distribution of ψ. The two subclasses are defined as follows:
ω(t) = ω0 if ψ(t) = ψ(t−1); otherwise, ω(t) = ω1   (7)
where ω 0 denotes the same value between the t th element and ( t -1)th element in ψ , whereas ω 1 denotes the different value between them. The number of initial clusters n is decided by the intervals of successive ω 0 ’s and the number of ω 1 ’s. Fig. 6 shows the number of initial clusters in a sequence.
Fig. 6. Examples of the number of initial clusters.
In the figure, the lines denote 0 for ω0 and 1 for ω1, showing the distribution of frame variations. The number of initial clusters in this sequence is 71, as shown in Fig. 6. Each cluster center is the average of the FCintra values in a run of ω0 frames or the FCintra value of a single ω1 frame. The distance between two clusters is measured with the Euclidean metric [14].
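A simplified sketch of the key-frame detection step, assuming the ordinal signatures have already been reduced to a 1-D signal per frame; the paper convolves a 3×3×3 DoG kernel over T, whereas this sketch uses a 1-D DoG with illustrative kernel size and border handling:

```python
import math

def dog_response(signal, sigma=1.8, k=math.sqrt(2), radius=4):
    """Difference-of-Gaussian response of a 1-D signal: a simplified
    stand-in for the 3x3x3 DoG convolution of Eqs. (5)-(6)."""
    def gauss(s):
        kern = [math.exp(-(i * i) / (2 * s * s)) for i in range(-radius, radius + 1)]
        total = sum(kern)
        return [v / total for v in kern]
    g_narrow, g_wide = gauss(sigma), gauss(k * sigma)
    dog = [a - b for a, b in zip(g_wide, g_narrow)]
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i, w in enumerate(dog):
            j = min(max(t + i - radius, 0), len(signal) - 1)  # clamp at borders
            acc += w * signal[j]
        out.append(acc)
    return out

def key_frames(psi):
    """Indices of strict local extrema of psi, treated as key frames."""
    return [t for t in range(1, len(psi) - 1)
            if (psi[t] > psi[t - 1] and psi[t] > psi[t + 1])
            or (psi[t] < psi[t - 1] and psi[t] < psi[t + 1])]

# A sudden scene change at t = 20 produces extrema near the change.
psi = dog_response([0.0] * 20 + [10.0] * 20)
print(key_frames(psi))
```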
Given two clusters, whether they are in the same subclass or not is decided by the Bayesian decision theory. This approach is based on quantifying the trade-offs between various classification decisions using probability and the costs that accompany such decisions. It makes the assumption that the decision problem is posed in probabilistic terms and that all of the relevant probability values are known. More generally, assume that there is a prior probability P ( ωk ) of each subclass k . These prior probabilities reflect prior knowledge of how likely it is that the static or dynamic subclass can be obtained before a sequence actually appears. The difference between the representative FCintra ’s in the two clusters is measured. Its value x is considered to be a random variable whose distribution depends on the class and is expressed as p ( x | ωk ). To determine the subclass of a cluster, the following decision rule is used: decide ω 0 if P ( ω 0 | x ) > P ( ω 1 | x ); otherwise decide ω 1 . The decision rule can be expressed as follows:
P(ω0 | x) > P(ω1 | x) ⇒ decide ω0; otherwise decide ω1   (8)
Suppose that both the prior probabilities P ( ωk ) and the conditional densities P ( x | ωk ) are known. It is known that the joint probability density of finding a pattern that is in subclass ωk and has feature value x can be written two ways: P ( ωk , x ) = P ( ωk | x ) p ( x ) = P ( x | ωk ) P ( ωk ). Bayes’ formula can be expressed as follows:
P(ωk | x) = p(x | ωk) P(ωk) / p(x)   (9)
Using (9), the decision rule of (8) can be rewritten as follows:
p(x | ω0) / p(x | ω1) > P(ω1) / P(ω0) ⇒ decide ω0; otherwise decide ω1   (10)
The quantity on the left is called the likelihood ratio and is denoted by Λ( x )
Λ(x) = p(x | ω0) / p(x | ω1)   (11)
The quantity on the right-hand side of (10) is the threshold of the test and is denoted by η :
η = P(ω1) / P(ω0)   (12)
Thus, the Bayes criterion leads to the likelihood ratio test (LRT) shown in (13):
Λ(x) > η ⇒ decide ω0; otherwise decide ω1   (13)
Owing to the goodness of fit between the actual data and the theoretical data, the distributions of P ( x | ω 0 ) and P ( x | ω 1 ) are assumed to have an approximately exponential distribution:
p(x | ωk) = αk exp(−βk x)   (14)
where k ∈ {0, 1} indexes the subclass ωk, and αk and βk are the model parameters. In this paper, the prior probabilities P(ω0) and P(ω1) for the test sequences are investigated as shown in Table 1. On average, P(ω0) is 0.93, and P(ω1) is 0.07. The model parameter values are α0 = 1,140,000, β0 = 2.824, α1 = 2,810, and β1 = 0.390.
Table 1. Prior probabilities according to test sequences
Using (13), it can be determined whether the given two clusters are merged or not: two clusters are merged if Λ( x ) is greater than η . Finally, c clusters can be obtained according to FCintra distribution, as shown in Fig. 7 .
Fig. 7. Relationship between FCintra distribution fc and the final clusters.
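The likelihood ratio test of (13) with the exponential densities of (14) can be sketched as follows, using the prior probabilities and model parameters reported above; the exponential form p(x | ωk) = αk·exp(−βk·x) is an assumed reading of (14):

```python
import math

# Parameters reported in the paper.
A0, B0 = 1_140_000, 2.824   # static subclass w0
A1, B1 = 2_810, 0.390       # dynamic subclass w1
P0, P1 = 0.93, 0.07         # prior probabilities

def should_merge(x):
    """Merge two clusters when Lambda(x) > eta, per Eq. (13).
    x is the difference between the clusters' representative FC_intra values."""
    lam = (A0 * math.exp(-B0 * x)) / (A1 * math.exp(-B1 * x))  # Eq. (11)
    eta = P1 / P0                                              # Eq. (12)
    return lam > eta

print(should_merge(1.0))  # True: small difference -> merge
print(should_merge(5.0))  # False: large difference -> keep separate
```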
Although the correlation between FCintra and the number of bits is high, the frame with the maximum FCintra does not always have the maximum number of encoded bits. Thus, candidate intra-frames need to be extracted. The candidate frame set Hs contains intra-frames, and a candidate frame in Hs is a frame that requires more than a certain number of encoded bits. Hs is specified in (15):

Hs = { Qs(θ(d), 1) | CC(Qs(θ(d), 1)) > μc, d = 1,…, D }   (15)

In (15), Qs(θ(d), 1) is a candidate intra-frame, D is the number of candidate frames, M is the number of intra-frames, θ(•) is a nondecreasing mapping function from the integer set {1,…, M}, and μc is the average of the FCintra values in each cluster. If CC(Qs(i, 1)) is greater than the content-adaptive threshold μc, the ith intra-frame is extracted as a candidate frame of the cth cluster.
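The candidate-frame extraction of (15) reduces to a threshold against the cluster mean; a minimal sketch, with illustrative complexity values:

```python
def candidate_frames(cc_values):
    """Indices of intra-frames whose complexity exceeds the cluster
    mean mu_c, per Eq. (15). cc_values holds CC(Q_s(i,1)) for the
    intra-frames of one cluster."""
    mu_c = sum(cc_values) / len(cc_values)
    return [i for i, cc in enumerate(cc_values) if cc > mu_c]

cc = [120, 95, 210, 180, 90]       # illustrative complexity values
print(candidate_frames(cc))        # -> [2, 3], the frames above the mean
```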
- 3.3 Model-Based Bit Rate Estimation
Using candidate frames with FCintra value of each cluster, the bit rates of clusters can be estimated via statistical analysis and a mathematical model. To estimate the bit rate while maintaining the given PSNR quality, a PSNR-Q model derived from the H.264/AVC quantization process [15] is proposed in this paper. With this model, an estimated QP is determined and is finally applied to the bit rate estimation. The relationship between the quantization step size ( Qstep ) and QP is given in (16) as follows:
Qstep = PF · 2^qbits / MF   (16)
where PF and MF are a post-scaling and a multiplication factor, respectively, in the H.264/AVC standard, and qbits = 15+floor (QP/6). When uniform quantization is applied to the uniformly distributed inputs, the mean square error ( MSE ) is given by
MSE = Qstep² / 12   (17)
From (16) and (17), the PSNR can be derived as
PSNR = 10 log10(255² / MSE) = a · QP + b   (18)
where a and b are constants obtained by linear regression [16] . As a result, the value of QP can be estimated as
QPe = (PSNRt − b) / a   (19)
where PSNRt is a given target PSNR, and QPe is an estimated QP.
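Assuming the linear form of (18), the QP estimation of (19) is a simple inversion; the constants a and b below are illustrative placeholders, not the paper's regression results:

```python
def estimate_qp(psnr_target, a=-0.55, b=55.0):
    """Invert the linear PSNR-Q model PSNR = a*QP + b (Eq. (19)).
    a and b are hypothetical regression constants for illustration."""
    return round((psnr_target - b) / a)

print(estimate_qp(42.0))  # -> 24 with these illustrative constants
```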
Using QPe, the number of intra-frame bits is estimated first. Some parameters obtained by the intra-frame estimation are then used to estimate the number of inter-frame bits in a GOP. To estimate the number of intra-frame bits, a simple but effective rate-quantization (R-Q) model is used. An exponential relationship between the actual number of encoded bits and QP was modeled by Zhou and his colleagues [17]. For simplicity, the R-Q model for an intra-frame is defined as:
Rq,1(QPe) = αq exp(−βq · QPe)   (20)
where Rq ,1 ( QPe ) is the number of encoded bits for the q th candidate intra-frame at QPe , and αq and βq are the model parameters. To reveal the relationship between the number of encoded bits and QP, Fig. 8 shows several examples of curve-fitting results for intra-frames, with each small dot of the mathematically approximated curves representing the actual number of encoded bits of an intra-frame at each QP. Because αq and βq can be obtained by exponential regression, Rq ,1 can also be calculated by (20).
Fig. 8. R-Q curves for the test sequences: (a) Music video, (b) Lecture, (c) Sports, (d) Documentary.
It is difficult to directly estimate the number of inter-frame bits in H.264/AVC. Thus, the bit rate conversion method introduced in [18] is used with the value of QPe instead of using the intra-frame R-Q model. The bit rate conversion is defined as
[Equation (21): bit rate conversion computing Rq,j+1(QPP) from the reference Rq,j+1(QPs)]
where Rq,j +1 ( QPP ) is the number of encoded bits for the ( j +1)th inter-frame in the q th GOP at QPP , and G is a GOP size. As defined in (21), this method requires encoding a GOP at a certain value of QP, QPs , as a reference, that is, Rq,j +1 ( QPs ) is computed in advance. In experiments, the value of QPs used is 26. Furthermore, QPP is set to QPe +1 here because an inter-frame QP is an intra-frame QP+1 in H.264/AVC rate control. After estimating the number of intra- and inter-frame bits, the total number of bits for each GOP, Rq , can be estimated using (20) and (21) as follows:
Rq = Rq,1(QPe) + Σ_{j=1}^{G−1} Rq,j+1(QPP)   (22)
The bit rate of each cluster is estimated using the GOP that is expected to have the maximum number of encoded bits among all candidate frames in each cluster. If the same bit rate between clusters is estimated, these clusters are grouped as a segment. Finally, the number of segments in a sequence is less than or equal to the number of clusters.
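Combining (20) and (22), the GOP bit estimate is the modeled intra-frame bits plus the estimated inter-frame bits; the exponential form and all parameter values below are illustrative, and inter_bits stands in for the bit rate conversion of (21):

```python
import math

def intra_bits(qp, alpha, beta):
    """Exponential R-Q model of Eq. (20); alpha and beta would come
    from exponential regression on the candidate intra-frames."""
    return alpha * math.exp(-beta * qp)

def gop_bits(qp_e, alpha, beta, inter_bits):
    """Eq. (22): intra-frame bits at QP_e plus the G-1 inter-frame
    estimates (each at QP_P = QP_e + 1, obtained via Eq. (21))."""
    return intra_bits(qp_e, alpha, beta) + sum(inter_bits)

# GOP of size 15 (IPPP): one intra-frame plus 14 inter-frames.
# All numbers are hypothetical.
total = gop_bits(24, alpha=2.0e6, beta=0.12, inter_bits=[9000] * 14)
print(round(total))
```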
4. Experimental Results
The performance of the proposed scheme is evaluated with several types of IPTV content. The proposed scheme will be called class-based bit rate estimation (CBRE) hereinafter, and the conventional scheme with a fixed bit rate of 2.5 Mbps will be called fixed bit rate estimation (FBRE) [19] . The standard definition (SD) resolution video content is categorized into four genres: lecture, religion and documentary, drama and animation, and music video and sports. A total of 30 videos in Table 2 are used as test sequences.
Table 2. Test sequences
In our experiment, the size of GOP is 15, and its type is set to IPPP. The target PSNR is set to 42dB. The simulated results encoded by FBRE can be compared in terms of the bit rate and quality to those encoded by CBRE. In order to evaluate the bit rate reduction, ΔR is calculated as follows:
ΔR = (RFBRE,i − RCBRE,i) / RFBRE,i × 100 (%)   (23)

where RFBRE,i and RCBRE,i indicate the bit rates by FBRE and CBRE in the ith cluster, respectively.
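The bit rate reduction of (23) can be sketched as follows, with hypothetical bit rates in Mbps:

```python
def bitrate_reduction(r_fbre, r_cbre):
    """Delta-R of Eq. (23): percentage bit rate saving of CBRE over FBRE."""
    return (r_fbre - r_cbre) / r_fbre * 100

# A 2.5 Mbps fixed rate reduced to an estimated 1.2 Mbps (illustrative).
print(bitrate_reduction(2.5, 1.2))  # about a 52% saving
```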
Table 3 shows the results of the bit rate reduction. CBRE can reduce the bit rate by up to 65.2% as compared with FBRE. On average, CBRE reduces the bit rate by approximately 52% in low-complexity and 6% in high-complexity video sequences. Because CBRE assigns the bit rate according to the complexity of each segment, a relatively high bit rate reduction can be achieved for the low-complexity video class.
Table 3. Bit rate reduction ratios of CBRE
Since the bit rate can be estimated by encoding candidate frames instead of the total frames, the computational complexity for CBRE depends on the ratio of the number of candidate frames to the total number of frames. Fig. 9 shows these ratios in the test sequences.
Fig. 9. Ratios of the number of candidate frames to the total number of frames in test sequences.
Table 4 shows that the difference in PSNR performance is approximately 1.2dB on average. However, this difference is too small to cause visible subjective quality degradation in the test sequences, as shown in Fig. 10, since the target PSNR is set to 42dB in (19), which makes it difficult to perceive a subjective quality difference.
Table 4. PSNR difference between FBRE and CBRE
Fig. 10. Subjective quality comparison: (a) CBRE and (b) FBRE.
5. Conclusions
The transcoding bit rate decision in conventional IPTV and Smart TV services is a time-consuming process, since the input sequence must be fully transcoded several times at different bit rates to decide a suitable bit rate. This paper showed that the video bit rate in an MPEG-2 TS to H.264/AVC transcoder, an essential device in those services, can be decided automatically while keeping the subjective video quality. The proposed bit rate estimation scheme was organized into two modules: hierarchical clustering-based sub-classification and statistical analysis-based bit rate estimation. The input sequence was grouped into several subclasses by hierarchical clustering using the parameter value extracted from each frame. The candidate frames of each subclass were used to estimate the bit rate via statistical analysis and a mathematical model. The bit rate could thus be estimated automatically by encoding only the candidate frames.
The proposed scheme could reduce the fixed bit rate, on average, by 52% in low-complexity video and by 6% in high-complexity video while maintaining the subjective quality. For future work, we plan to study some practical issues in implementing the proposed scheme. Note that in real TV services, additional work is needed to simplify the proposed scheme, especially the clustering-based video sub-classification. We also need to extend the results to HD test sequences.
BIO
Hye Jeong Cho received the B.S. degree in 2004 from the Department of Internet Information Engineering, Hanyang Women’s College, Seoul, Korea. In 2012, she received the joint M.S. and Ph.D. degree in electronic engineering, Kwangwoon University, Seoul, Korea. She is currently a senior engineer in AV Research and Development Laboratory, ARION Technology Inc., Gyeonggi-do, Korea. Her research interests include video processing, STB and IPTV video streaming services.
Chae-Bong Sohn received the B.S., M.S., and Ph.D. degree in electronic engineering from Kwangwoon University, Seoul, Korea in 1993, 1995, and 2006, respectively. He is currently an associate professor in department of Electronics and Communications Engineering, Kwangwoon University, Seoul, Korea. His research interests include image compression, transcoding, digital broadcasting systems.
Seoung-Jun Oh was born in Seoul, Korea, in 1957. He received both the B.S. and the M.S. degrees in electronic engineering from Seoul National University, Seoul, in 1980 and 1982, respectively, and the Ph.D. degree in electrical and computer engineering from Syracuse University, New York, in 1988. In 1988, he joined ETRI, Daejeon, Korea, as a senior research member. From 1990 to 1992, he was a Director of the Multimedia Research Section, ETRI. Since 1992, he has been a professor in the Department of Electronic Engineering, Kwangwoon University, Seoul, Korea. He has been a chairman of SC29-Korea since 2001. His research interests include image and video processing, video coding, and object recognition.
References
Cycon H. L. , Schmidt T. C. , Wahlisch M. , Marpe D. , Winken M. 2011 “A temporally scalable video codec and its applications to a video conferencing system with dynamic network adaption for mobiles” IEEE Trans. Consumer Electron. Article (CrossRef Link). 57 (3) 1408 - 1415    DOI : 10.1109/TCE.2011.6018901
Park S. , Jeong S. H. 2008 “Mobile IPTV: approaches, challenges, standards and QoS support” IEEE Internet Comput. Article (CrossRef Link). 13 (3) 22 - 31
Wiegand T. , Sullivan G. J. , Bjontegaard G. , Luthra A. 2003 “Overview of the H.264/AVC video coding standard” IEEE Trans. Circuits Syst. Video Technol. Article (CrossRef Link). 13 (7) 560 - 576    DOI : 10.1109/TCSVT.2003.815165
Joch A. , Kossentini F. , Schwarz H. , Wiegand T. , Sullivan G.J. 2002 “Performance comparison of video coding standards using Lagrangian coder control” in Proc. of IEEE Int. Conf. Image Processing Sep. vol. 2, Article (CrossRef Link). II-501 - 504
Kim T. , Bahn H. 2008 “Implementation of the storage manager for an IPTV set-top box” IEEE Trans. Consumer Electron. Article (CrossRef Link). 54 (4) 1770 - 1775    DOI : 10.1109/TCE.2008.4711233
2002 ITU-R Recommendation BT.500-11, “Methodology for the subjective assessment of the quality of television pictures,” ITU Article (CrossRef Link).
Kim Wook Joong , Yi Jong Won , Kim Seong Dae 1999 “A bit allocation method based on picture activity for still image coding” IEEE Trans. Image Process. Article (CrossRef Link). 8 (7) 974 - 977    DOI : 10.1109/83.772244
Li J. , Abdel-Raheem E. 2010 “Efficient rate control H.264/AVC intra frame” IEEE Trans. Consumer Electron. Article (CrossRef Link). 56 (5) 1043 - 1048    DOI : 10.1109/TCE.2010.5506037
Jing X. , Chau L.-P. , Siu W.-C. 2008 “Frame complexity-based rate-quantization model for H.264/AVC intraframe rate control” IEEE Trans. Signal Process. Lett. Article (CrossRef Link). 15 373 - 376    DOI : 10.1109/LSP.2008.920010
Zhou Y. , Sun Y. , Feng Z. , Sun S. 2009 “New rate-distortion modeling and efficient rate control for H.264/AVC video coding” Signal Process.: Image Commun. Article (CrossRef Link). 24 (5) 345 - 356    DOI : 10.1016/j.image.2009.02.014
Duda Richard O. , Hart Peter E. , Stork David G. 2000 Pattern Classification 2nd ed. Wiley-Interscience Article (CrossRef Link). 20 - 82
Chiu C.Y. , Chen C.S. , Chien L.F. 2008 “A framework for handling spatiotemporal variations in video copy detection” IEEE Trans. Circuits Syst. Video Technol. Article (CrossRef Link). 18 (3) 412 - 417    DOI : 10.1109/TCSVT.2008.918447
Lowe G. 2004 “Distinctive image features from scale-invariant keypoints” Int. Journal of Computer Vision Article (CrossRef Link). 60 91 - 110    DOI : 10.1023/B:VISI.0000029664.99615.94
Deza M. M. , Deza E. 2009 Encyclopedia of Distances 1st ed. Springer Article (CrossRef Link). 89 - 100
Liu Y. , Li Z. G. , Soh Y. C. 2007 “A novel rate control scheme for low delay video communication of H.264/AVC standard” IEEE Trans. Circuits Syst. Video Technol. Article (CrossRef Link). 17 (1) 68 - 78    DOI : 10.1109/TCSVT.2006.887081
Edwards A.L. 1976 An Introduction to Linear Regression and Correlation W.H. Freeman Article (CrossRef Link). 33 - 46
Zhou Y. , Sun Y. , Feng Z. , Sun S. 2009 “New rate-distortion modeling and efficient rate control for H.264/AVC video coding” Signal Process.: Image Commun. Article (CrossRef Link). 24 (5) 345 - 356    DOI : 10.1016/j.image.2009.02.014
Tang Q. , Mansour H. , Nasiopoulos P. , Ward R. 2008 “Bit-rate estimation for bit-rate reduction H.264/AVC video transcoding in wireless networks” in Proc. of IEEE Int. Sym. Wireless Pervasive Comput. May Article (CrossRef Link). 464 - 467
Cho H. J. , Lee J. , Noh D. Y. , Jang S. H. , Kwon J. C. , Oh S. J. 2011 “A new video bit rate estimation scheme using a model for IPTV services” KSII Trans. Internet and Information Syst. Article (CrossRef Link). 5 (10) 1814 - 1829