Focal Stack Based Light Field Coding for Refocusing Applications
Journal of Broadcast Engineering. 2019. Dec, 24(7): 1246-1258
Copyright © 2016, Korean Institute of Broadcast and Media Engineers. All rights reserved.
This is an Open-Access article distributed under the terms of the Creative Commons BY-NC-ND (http://creativecommons.org/licenses/by-nc-nd/3.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited and not altered.
  • Received : October 21, 2019
  • Accepted : December 13, 2019
  • Published : December 01, 2019
About the Authors
Vinh Van Duong
Department of Electrical and Computer Engineering, Sungkyunkwan University, Korea.
Thuong Nguyen Canh
He performed this work while in Department of Electrical and Computer Engineering, Sungkyunkwan University, Korea. He is now with the Institute for Datability Science, Osaka University, Japan.
Thuc Nguyen Huu
Department of Electrical and Computer Engineering, Sungkyunkwan University, Korea.
Byeungwoo Jeon
Department of Electrical and Computer Engineering, Sungkyunkwan University, Korea.
bjeon@skku.edu

Abstract
Since a light field (LF) image has a huge data volume, a high-performance compression technique is required for efficient transmission and storage of its data. Camera users may wish to render parts of an image at different levels of focus of their own choice at any time. To address this refocusing functionality, in this paper we first render a focal stack consisting of multi-focus images and then compress it instead of the original LF data. The proposed method has the advantage of minimizing the amount of LF data needed to realize the targeted refocusing applications. Our experimental results show that the proposed method outperforms a state-of-the-art LF image compression method.
Ⅰ. Introduction
A Light Field (LF) camera benefits from computational photography techniques which allow it to capture not only the intensities but also the directions of the light rays arriving at the camera. It can provide many interesting functionalities such as digital refocusing, depth estimation, viewpoint change, and 3D reconstruction. Ever since the first introduction of the LF concept by G. Lippmann in 1908 [1], it has kept being investigated. For example, H. Ives developed the first LF recording device in 1928 [2], M. Levoy applied LF technology to microscopy in 2006 [3], and K. Fife designed an LF CCD sensor in 2008 [4]. In 2012, Ren Ng built a hand-held LF camera called the Lytro camera [5]. It has a micro-lens array placed in front of the sensor in order to make the sensor act like a micro camera array that records multiple views of the same scene with slightly different perspectives from each other. However, the multiple functionalities of LF require much data to be stored (or transmitted). For example, the raw lenslet image taken by a Lytro Illum camera has a size of 7728x5368 pixels, and its data in the 4D LF format (i.e., multi-view images) contains 225 views (or sub-aperture images (SAIs)) of 625x434 pixels each. Thus, one single LF image is about 200 Mbytes, which is 30~40 times larger than a conventional image. Therefore, a powerful compression method is strongly required.
Many researchers have focused in recent years on designing LF compression schemes. Some recently noted efforts are the JPEG Pleno standardization on LF imaging in 2015 [6], the ICME 2016 grand challenge on LF image compression [7], the ICIP 2017 grand challenge on LF image coding [8], and the ICME 2018 grand challenge on densely sampled LF reconstruction [9]. Most of the methods reported so far can be divided into two basic approaches: a) compressing the lenslet image by exploiting the redundancy within its structure; and b) compressing the LF data in the 4D LF format by further exploiting the redundancies among multi-view images. However, very little attention has been paid to compression schemes designed with a specific LF application in mind. That is, most developments have focused only on how to maximally reduce the size of the data in general. In [10], a study is carried out on the impact of light field compression on the focal stack and its extended focused images. The study showed that a refocused image (RF) is more robust to blur than the extended focused image when it is affected by compression. Perra et al. [11] showed how subsampling of light field data before compression affects the end-users' quality of experience in light field applications. By down-sampling the light field image at an appropriate ratio, one can achieve better compression without much loss of subjective quality. Comprehensive studies on the perceptual quality of compressed light fields and of light field video streaming are also presented in [12] and [13], respectively.
In this paper, we aim to design a new compression scheme that specifically targets the refocusing application only. An LF image allows one to change the image focus after the image has been taken, but a large amount of data is required to make this possible at the user's side. Normally, we would compress the 4D LF data first, and then render a set of its images focused at different depths (this set is called a focal stack). This approach can reduce the amount of data, but not by much. Knowing the specific application, one can instead render the focal stack images first and then compress them instead of the 4D LF data.
In the literature, the focal stack has been shown to be useful for estimating depth or creating an all-in-focus image (AIF). In [14], A. Levin et al. show that one can fully reconstruct the 4D LF from a focal stack. This means that, on one hand, converting the 4D LF data to a focal stack can greatly reduce the amount of data to be compressed, while on the other hand, the original 4D LF can still be recovered for different applications. Slightly different from this general goal, in this paper we target only the refocusing fidelity in LF applications. That is, we assume that users only want to see their object of interest at different depth levels. With this specific scenario in mind, we first render a focal stack which contains all the depth layers, and then compress the focal stack directly. By doing so, one can minimize the amount of data for transmission and storage. It should be noted that the number of refocused images (RFI's) is much smaller than that of the SAIs.
In the rest of this paper, we first briefly review the concept of LF imaging and the refocusing algorithm in Section II. Then, we describe the conventional and the proposed LF compression frameworks in Section III. Next, the algorithm for reconstructing the focus map and the all-in-focus image is presented in Section IV. Finally, Sections V and VI present the experimental results and conclusion, respectively.
Ⅱ. Light Field Imaging: A Review
In order to better understand how an LF camera captures an image, here we briefly review the basic principle of LF imaging and the structure of the camera design. Different from a conventional camera, an image captured by an LF camera can be refocused. The refocusing at a different distance or depth is controlled by a parameter of the refocusing function, denoted by ρ. A focal stack refers to a set of images generated with a few different ρ values. The very basics of the focal stack rendering algorithm are described below; more detail can be found in [15].
- 1. Structure of light field camera
The plenoptic function is a 7D function P(x, y, z, γ, θ, λ, t) which describes the light travelling from the real world to the observer (e.g., a camera). It is a function of the spatial coordinate (x, y, z), the direction of the light ray (γ, θ), the wavelength λ, and time t, all of which allows one to reconstruct an image from every possible viewpoint. This function with many parameters can be simplified by assuming a static scene (dropping time t) and a fixed RGB color space (dropping the wavelength λ), so the plenoptic function can be written as P(x, y, z, γ, θ). It is still described in terms of the position and the direction of the light arriving at the camera sensor. Levoy et al. [16] represented the function as L(u1, u2, s1, s2) using only four parameters by parameterizing the finite space between two planes. Fig. 1(a) shows the main lens as the first plane, described by coordinate (u1, u2), while the micro-lens array (MLA) is the second plane, described by coordinate (s1, s2).
The first commercial plenoptic camera (known as the Lytro camera) which can capture LF data has an MLA installed in front of the sensor of a conventional camera [5], as in Fig. 2. A 2D sensor is placed behind the MLA at a distance β, where β is the focal length of the micro-lenses. In this structure, each micro-lens separates the incoming light rays by direction, and the sensor elements behind it record the separated rays. Once the raw LF image is captured, one needs to decode it, R(r), into the 4D representation of the LF, L(u, s), where r = [r1, r2]^T is a vector representing a spatial location on the sensor, and u = [u1, u2]^T and s = [s1, s2]^T indicate the respective positions shown in Fig. 1(b). If the LF camera has M × N micro-lenses and P × P pixels behind each micro-lens, one can extract P × P views Iu(s). This process can be done using the Light Field Matlab Toolbox [17]. For example, the decoded 4D LF captured by the Lytro Illum camera which we use in this paper has 15 × 15 views, and each view has a size of 625 × 434 pixels.
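As an illustration of this decoding step, the following Python sketch rearranges an idealized, already-rectified lenslet image into the 4D LF L(u1, u2, s1, s2). The aligned square-block layout is a simplifying assumption of this sketch; the real Lytro pipeline in the toolbox [17] also performs demosaicing, vignetting correction, and hexagonal-grid resampling.

```python
import numpy as np

def lenslet_to_4d_lf(raw, M, N, P):
    """Rearrange an idealized lenslet image into the 4D LF L(u1, u2, s1, s2).

    Assumes each of the M x N micro-lenses covers an aligned P x P pixel
    block, so raw[m*P + u1, n*P + u2] is the sample of view (u1, u2) at
    micro-lens (spatial) position (m, n).
    """
    assert raw.shape == (M * P, N * P)
    # (m, u1, n, u2) -> (u1, u2, s1, s2)
    return raw.reshape(M, P, N, P).transpose(1, 3, 0, 2)

# Toy example: 2x3 micro-lenses with 2x2 pixels behind each one
raw = np.arange(4 * 6).reshape(4, 6)
lf = lenslet_to_4d_lf(raw, M=2, N=3, P=2)
print(lf.shape)  # (2, 2, 2, 3): 2x2 views, each with 2x3 "pixels"
```

For a real Lytro Illum capture, M × N would be the micro-lens grid and P = 15, yielding the 15 × 15 views mentioned above.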
Fig. 1. (a) Light field representation as a 4D function L(u1, u2, s1, s2); (b) All the light directions hitting the main lens at a coordinate (u1, u2) are recorded by the sensor behind each micro-lens. The spatial coordinate (s1, s2) corresponds to the micro-lens array.
Fig. 2. Structure of the LF camera (Lytro Illum). The camera captures an image of Object A at distance z0 with its sensor array placed behind the main lens at x0. The micro-lens array is located at a distance β from the sensor. The rays of light with different directions (orange, blue, and green) coming from the same point of the object are separated by the micro-lenses and recorded by the sensor behind each micro-lens.
- 2. Refocusing in light field camera
An image captured by LF camera can be used to synthesize an image having different focus, and we will briefly review it here [15] . As seen in Fig. 2 , the object is focused by the main lens on the MLA located at a distance x 0 from the main lens, and only the object located at depth z 0 will be perfectly focused according to the well-known relationship given below:
$$\frac{1}{x_0} + \frac{1}{z_0} = \frac{1}{f} \qquad (1)$$
where f is the focal length of the main lens. From equation (1), we see that if the object is at infinity (z0 = ∞), its in-focus image will be formed exactly at the focal length of the main lens (x0 = f); otherwise its in-focus image will be located at a focal plane at x0 with x0 > f. Different from a conventional camera, an LF camera allows one to compute a refocused image (RFI) at any focal plane x with x ≠ x0. After an image has been captured, the image corresponding to the focal plane at x0 can be computed directly by summing up all the views Iu(s) to obtain the RFI I(s; x0) as:
$$I(s; x_0) = \sum_{u} I_u(s) \qquad (2)$$
As can be seen in Fig. 3(a), changing the focus sets up the refocused focal plane at x; therefore, the summation is done over the shifted views as:
Fig. 3. (a) Light field image refocused from distance z0 to z. (b) Zoomed view of the red box in (a). The difference between the focal planes before and after refocusing is Δx, which can be calculated from the design parameters of the camera.
$$I(s; x) = \sum_{u} I_u(s + \Delta u) \qquad (3)$$
As shown in Fig. 3(b), we use the similar-triangle rule on the two triangles DOC and CBA to compute the shift amount Δu required in equation (3):
$$\Delta u = \frac{\mu\,\Delta x}{\beta D}\, u \qquad (4)$$
where μ is the pixel pitch and D is the diameter of a micro-lens. Note that equation (4) depends on μ, β, D, and Δx, where Δx is the difference between the original and refocused focal planes. The refocusing parameter ρ is now defined as:
$$\rho = \frac{\mu\,\Delta x}{\beta D} \qquad (5)$$
Using (4) and (5), (3) can be rewritten as (6) which shows RFI’s as a function of the refocusing parameter ρ .
$$I(s; \rho) = \sum_{u} I_u(s + \rho u) \qquad (6)$$
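The shift-and-add rendering of (6) can be sketched in Python as follows. This is a minimal integer-pixel version for illustration only: the centered view indexing is an assumption of this sketch, and a practical renderer would interpolate sub-pixel shifts rather than round.

```python
import numpy as np

def refocus(views, rho):
    """Shift-and-add refocusing sketch: every view I_u is shifted by rho*u
    (u being the aperture index, centered on the middle view) and the shifted
    views are averaged. Integer-pixel shifts via np.roll keep the sketch
    short; a practical renderer uses sub-pixel interpolation.
    `views` has shape (U1, U2, H, W)."""
    U1, U2, H, W = views.shape
    c1, c2 = (U1 - 1) // 2, (U2 - 1) // 2
    out = np.zeros((H, W))
    for u1 in range(U1):
        for u2 in range(U2):
            d1 = int(round(rho * (u1 - c1)))
            d2 = int(round(rho * (u2 - c2)))
            out += np.roll(views[u1, u2], (d1, d2), axis=(0, 1))
    return out / (U1 * U2)

# rho = 0 performs no shifting, reproducing the plain average of all views
views = np.random.rand(3, 3, 8, 8)
assert np.allclose(refocus(views, 0.0), views.mean(axis=(0, 1)))
```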
Referring to (1) and noting x = x0 − Δx, the focus distance z = xf/(x − f) can be expressed as:
$$z = \frac{(x_0 - \Delta x)\,f}{x_0 - \Delta x - f} \qquad (7)$$
Finally, by substituting Δx = ρβD/μ from (5) into (7), one obtains the following equation:
$$z = z_0\,\frac{1 - a_0 \rho}{1 - a_1 \rho} \qquad (8)$$
where a0 = βD/(μx0), a1 = z0a0/f, and z0 = x0f/(x0 − f) is the in-focus distance of the camera.
The model (8) has several properties. Firstly, when ρ = 0, the refocusing distance z = z0, corresponding to the case without any refocusing, as expected. If the refocusing distance is larger than the original one (that is, z > z0), then ρ > 0; otherwise (that is, z < z0), ρ < 0. In this paper, for the Lytro Illum camera, we choose ρ = 1, which corresponds to a focus distance z at infinity, and ρ = −1, which corresponds to the focus distance closest to the camera, as in [11].
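To make the ρ-to-distance mapping via (5) and (7) concrete, a small numeric sketch follows; the camera parameter values below are purely illustrative and are not actual Lytro Illum calibration data.

```python
def refocus_distance(rho, x0, f, beta, D, mu):
    """Focus distance z for refocusing parameter rho, following (5) and (7):
    Delta_x = rho*beta*D/mu, x = x0 - Delta_x, z = x*f/(x - f).
    All parameter values passed in below are illustrative only."""
    x = x0 - rho * beta * D / mu
    return x * f / (x - f)

# Illustrative numbers only (millimetres), not real calibration data:
f, x0 = 10.0, 10.5                # main-lens focal length; MLA plane distance
beta, D, mu = 0.04, 0.02, 0.0014  # micro-lens focal length, diameter; pixel pitch
z0 = refocus_distance(0.0, x0, f, beta, D, mu)
print(z0)  # 210.0: the in-focus distance with no refocusing (rho = 0)
```

Increasing ρ above 0 moves the focal plane x closer to f, so z grows beyond z0, matching the sign behaviour described above.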
Ⅲ. Light Field Image Compression
In this section, we first describe the state-of-the-art method [18] to compress an LF image based on the 4D LF representation for general applications. Observing that the amount of data to process is still large for applications requiring only refocusing, we then investigate a new framework designed especially for refocusing applications. The details are described as follows.
- 1. Review of state-of-the-art method: 4D LF-based Compression
The 4D LF based compression scheme that we consider in this work is described in Fig. 4. Basically, it has four main processes: format conversion, pseudo sequence conversion, compression, and rendering of the focal stack. Each process is explained as follows.
Fig. 4. A state-of-the-art framework for conventional LF image coding: 4D LF based compression.
The raw LF image, which is high-dimensional data, can be represented in one of the lenslet, 4D LF, or epipolar formats. Thus, at first, we need to find the data format which brings the best coding efficiency. The data formats can also be either non-invertible or invertible with respect to the type of camera. In this paper, we consider a conversion from the lenslet data, captured by a plenoptic 1.0 camera (i.e., the Lytro camera), to the 4D LF. It is invertible, since one can convert between the two data formats in either direction without losing any information. The conversion can be done using the Light Field Matlab Toolbox [17]. I. Viola et al. [18] show that the 4D LF data format gives significantly better coding efficiency than compressing data in the lenslet or epipolar formats. Therefore, we select the 4D LF data format as the input of the conventional compression scheme to compare with our proposed framework.
Subsequently, we arrange the sub-aperture images (SAIs) as a pseudo sequence following a selected scan order, in such a way that adjacent SAIs have high similarity. The scan order can be selected to ensure that each SAI is as close as possible to its reference frame, so that the coding scheme benefits most from inter-prediction. The work in [19] investigated different scan orders, such as the raster scan, the rotation scan, and combinations thereof, and showed that careful selection of the order can bring a significant improvement in compression. In this paper, we choose the raster scan order to form a pseudo sequence, the same as in [18].
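The raster scan ordering is straightforward to sketch; the function below simply enumerates the view grid row by row (other orders investigated in [19], such as the rotation scan, would replace this enumeration).

```python
def raster_scan_order(rows, cols):
    """Raster-scan ordering of the SAI grid: views are read row by row,
    left to right, to form the pseudo video sequence for the encoder."""
    return [(r, c) for r in range(rows) for c in range(cols)]

order = raster_scan_order(13, 13)  # the 13x13 SAI subset used for coding
print(order[:3], order[-1])  # [(0, 0), (0, 1), (0, 2)] (12, 12)
```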
In the next step, the video sequence is coded with a video encoder such as a High Efficiency Video Coding (HEVC) encoder, the most recent video coding standard. Due to the black views in the corners, we only select 13x13 SAIs to encode. Regarding this issue, the authors of [20] also tried selecting different subsets of SAIs (that is, 9x9, 11x11, 13x13, and 15x15 SAIs) to evaluate the quality of the RFIs rendered from these subsets. The selection should be of a square shape in order to render the RFI's. The results show that the selection of 13x13 SAIs strikes a balance between coding efficiency and RFI quality. For example, if we select only a small subset of views such as 9x9 SAIs, the out-of-focus region in the RFI still contains sharp edges and has a very shallow depth of field, so the refocus range (i.e., the maximum extent to which the image can be refocused) is limited. Finally, after compression, we convert the video sequence back to the 4D LF format to generate RFI's according to the shift-and-add algorithm in equation (6).
- 2. Proposed method: Focal stack based compression
As illustrated in Fig. 5, we propose to compress the focal stack instead of the 4D LF format in order to reduce the amount of data to process. The proposed scheme consists of four main processes, in the same way as the conventional scheme, except that the rendering of the focal stack is performed before the coding process. The encoder can reduce the redundancy among SAIs in the 4D LF format, but at the decoder side we still need to reconstruct the original 4D LF data to perform the refocusing function, which requires significant storage space. Compared to the compression approach using the 4D LF format, converting the 4D LF data to a focal stack followed by direct compression greatly reduces the amount of data to transmit or store. Moreover, not only is it easy to synthesize the focal stack from LF data or convert it back [14], but the RFI's in the focal stack are also easily compressed, because they contain mostly low-frequency components in the out-of-focus regions [21].
Fig. 5. The proposed framework for LF image coding for refocusing applications: focal stack based compression.
In the focal stack rendering step, we generate RFI's in the same way as in the conventional approach. The number of RFI's in the focal stack varies depending on the refocus range, from the closest focus point in the foreground to the furthest one in the background. Scenes with a large depth variation require more images, while scenes having all objects close together require fewer images. Therefore, the number of images in the focal stack should be large enough to cover all depth levels, but not so large as to cause overlaps of the depth of field between the RFI's. In this paper, we investigate several different numbers of images in the focal stack for various image contents to see how much this affects the quality of the refocused images. Before encoding, we arrange the RFI's as a pseudo sequence in the order of the RFI's focused from the foreground to the background of the image. It is noted that the work most closely related to our proposed method is [21], which compresses the RFI's using DWT-based compression for 4D LF data. Since their goal is to reconstruct the 4D LF with the best quality, they also compress the residuals of the SAI images as well as the RFI's; hence, our method differs from theirs. In addition, our proposed framework exploits the most recent video coding standard, HEVC, which leads us to expect better compression compared with the existing work [21].
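The choice of refocusing parameters for the stack can be sketched as below. The even spacing of ρ between −1 and 1 is an assumption of this sketch: the text fixes only the two endpoints, and a content-adaptive spacing over the scene's depth range is equally possible.

```python
import numpy as np

def focal_stack_parameters(K, rho_near=-1.0, rho_far=1.0):
    """K refocusing parameters from the nearest focus plane (rho = -1) to
    infinity (rho = 1). The RFIs rendered in this order form the pseudo
    sequence, focused from foreground to background. Even spacing is an
    illustrative assumption, not a choice stated in the paper."""
    return np.linspace(rho_near, rho_far, K)

rhos = focal_stack_parameters(11)  # K = 11, the smallest stack size tested
print(rhos[0], rhos[-1])  # -1.0 1.0
```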
Ⅳ. Reconstruction of Focus Map and All-in-focus Image from Focal Stack
The refocusing application enables users not only to "click and refocus" the image but also to see the all-in-focus image. From the focal stack images, we need to estimate a focus map to guide the display of the RFI's as well as the computation of the all-in-focus image. Each pixel in the focus map holds the index of the RFI which gives the best focus at that pixel. Therefore, when a user clicks on a region that should be in focus, the focus map identifies the image in the focal stack with the best focus for the selected region. Moreover, by merging all the in-focus regions, one can reconstruct an all-in-focus image. In this paper, we utilize the method in [22] to estimate the focus map. The algorithm consists of two steps: first, we detect the regions having strong gradients in each image of the focal stack, and then we expand these regions using graph cut optimization [23]. Each RFI contains in-focus and out-of-focus regions, where the in-focus regions are much sharper than the out-of-focus ones. Given a focal stack of K images, for each image Ik in the focal stack we compute a layer Fk that indicates the focus score of the pixels in Ik:
$$F_k(x, y) = \sum_{m} \alpha_m\, U\!\left(\left|\nabla I_k(x, y)\right| - m\,\delta_k\right) \qquad (9)$$
where U(∙) is the unit step function, ∇Ik(x, y) is the image gradient of Ik(x, y), and δk is the gradient threshold used for Ik. The focus score Fk(x, y) is obtained by comparing the image gradient magnitude against a sequence of increasing thresholds mδk, weighted with coefficients αm. Subsequently, we use the graph cut algorithm in [23] to find the index of the image where each pixel has the strongest gradient (i.e., the maximum focus score). Following the scheme in [22], we obtain a focus map l such that lX := l(x, y) gives the index of the focal stack image in which the pixel X = (x, y) is in focus. The focus map l is computed by minimizing an objective function E(l) of the following form:
$$E(l) = \sum_{X} D(X, l_X) + \sum_{X \sim X'} S(l_X, l_{X'}) \qquad (10)$$
Here D(X, lX) is a data term that represents the cost of assigning the label lX to the pixel X. The data cost D is determined from the sharpness scores of the pixels as:
$$D(X, l_X) = -F_{l_X}(x, y) \qquad (11)$$
The second term in (10) controls the smoothness of the variation of labels among neighboring pixels X and X′ (denoted X ∼ X′). We set the smoothness cost as:
$$S(l_X, l_{X'}) = \frac{\log\left(1 + \left|l_X - l_{X'}\right|\right)}{\log K} \qquad (12)$$
where K is the number of images in the focal stack. The smoothness cost increases with the label difference between neighboring pixels only at a logarithmic rate, in order to avoid over-penalizing large label differences across depth discontinuities. By minimizing the cost function E(l), we obtain a focus map which makes it possible to compute an all-in-focus image, AIF(x, y), by the following equation:
$$AIF(x, y) = I_{l(x,y)}(x, y) \qquad (13)$$
where the focus map lX := l(x, y) gives the index of the focal stack image Ik in which the pixel X = (x, y) is in focus.
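The focus-score and AIF pipeline described above can be sketched as follows. The gradient thresholds and weights are illustrative values (not those of [22]), and the graph-cut refinement of [23] is replaced here by a simple per-pixel argmax for brevity.

```python
import numpy as np

def focus_scores(stack, deltas=(0.02, 0.04, 0.08), alphas=(1.0, 2.0, 4.0)):
    """Per-pixel focus scores for each focal-stack image: weighted counts of
    increasing gradient-magnitude thresholds exceeded. Thresholds and
    weights here are illustrative, not the values used in [22]."""
    scores = []
    for img in stack:
        gy, gx = np.gradient(img.astype(np.float64))
        grad = np.hypot(gx, gy)
        s = np.zeros_like(grad)
        for delta, alpha in zip(deltas, alphas):
            s += alpha * (grad > delta)  # unit step U(|grad| - threshold)
        scores.append(s)
    return np.stack(scores)  # shape (K, H, W)

def all_in_focus(stack, scores):
    """Compose the AIF image: at each pixel, take the value from the stack
    image with the highest focus score (per-pixel argmax focus map; the
    paper instead smooths this map with graph cuts [23])."""
    focus_map = scores.argmax(axis=0)  # l(x, y)
    ys, xs = np.indices(focus_map.shape)
    return stack[focus_map, ys, xs], focus_map

# Toy stack: image 0 has a sharp line on the left, image 1 on the right
img0 = np.zeros((8, 8)); img0[:, 2] = 1.0
img1 = np.zeros((8, 8)); img1[:, 5] = 1.0
stack = np.stack([img0, img1])
aif, fmap = all_in_focus(stack, focus_scores(stack))
print(fmap[4, 1], fmap[4, 6])  # 0 1
```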
Ⅴ. Experimental Results
In this paper, we use a set of six images captured by a Lytro Illum camera, namely I01 to I06, from the EPFL light field dataset [25]. They cover many different scenes, such as indoor, outdoor, and people, as shown in Fig. 6. The raw lenslet image has a size of 7728x5368 with 10 bits per pixel precision. To obtain the 4D LF data, the raw lenslet images were converted by the Light Field Matlab Toolbox [17], where each 4D LF output contains 225 views (i.e., sub-aperture images (SAIs)) with a resolution of 625x434 pixels at 8 bits per pixel precision.
Fig. 6. Six images captured by a Lytro Illum camera from the EPFL light field dataset [25].
We use the HEVC reference software (HM) version 16.17 [24] for encoding and decoding. The QP is set from 10 to 45 to simulate various image qualities (i.e., from good to bad quality). The configuration of the encoder is set as follows: a GOP structure of I-B-B-B with only the first frame as a reference, in order to obtain the best coding performance [19]. To measure the coding gain of the proposed method, the Bjontegaard bitrate (BDBR) and Bjontegaard delta PSNR (BD-PSNR) [26] are measured against the anchor of the 4D LF based compression approach [18]. A negative BDBR (or positive BD-PSNR) represents better coding efficiency of the proposed method compared with the anchor.
- 1. Quality Evaluation: bitrate vs. PSNR
Average PSNR is computed by averaging PSNR values of all RFI’s in the focal stack, as:
$$\mathrm{PSNR}_{avg} = \frac{1}{K} \sum_{k=1}^{K} \mathrm{PSNR}_k \qquad (14)$$

where Ik is the RFI rendered from the original LF image and Îk is the RFI rendered from the compressed LF image. The PSNRk of each RFI in the focal stack is computed as follows:

$$\mathrm{PSNR}_k = 10 \log_{10}\!\left(\frac{255^2}{\mathrm{MSE}_k}\right) \qquad (15)$$

where the mean squared error (MSE) is defined as:

$$\mathrm{MSE}_k = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left[I_k(i, j) - \hat{I}_k(i, j)\right]^2 \qquad (16)$$
where m and n are the dimensions of the RFI (i.e., m = 625, n = 434).
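The average PSNR computation above can be sketched directly:

```python
import numpy as np

def psnr(ref, dist, peak=255.0):
    """PSNR of one RFI against its reference, from the MSE over all pixels."""
    mse = np.mean((np.asarray(ref, dtype=np.float64)
                   - np.asarray(dist, dtype=np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def average_psnr(ref_stack, dist_stack):
    """Average PSNR over the K RFIs of a focal stack."""
    return float(np.mean([psnr(r, d) for r, d in zip(ref_stack, dist_stack)]))

# A uniform error of 5 gray levels gives MSE = 25
ref = np.full((4, 4), 100.0)
dist = ref + 5.0
print(round(psnr(ref, dist), 2))  # 34.15
```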
To determine the number of images to generate in each focal stack, we tested several values of K = 11, 17, 23, 31, 41, 51. These values of K are chosen to be large enough to fully reconstruct the 4D LF from the focal stack with good quality [27]. Since our goal is not to reconstruct the original 4D LF, we selected these values simply to test our refocusing application. As can be seen in Table I, the bitrate saving decreases as the number of images in the focal stack increases; however, the difference is not large. For example, when K = 11, the BDBR is −59.94%, compared to −40.88% when K = 51. Even with K about five times larger, the BDBR did not change as much as expected. This is due to the properties of the refocused images, in which the out-of-focus regions contain mostly low-frequency components that are easy to compress [21].
Table I. BDBR (%) and BD-PSNR [dB] of the images in the focal stack, measured against the 4D LF based compression scheme [18].
Fig. 7 shows the rate-distortion performance for the LF images I01 to I06. Since the 4D LF data is converted to focal stack data, our proposed compression scheme obviously achieves better coding gain than the 4D LF based compression scheme, especially at low bitrates. The BDBR and BD-PSNR depend on the image content. For example, as can be seen in Fig. 6, I02 and I03 contain many textured regions, giving a lower bitrate reduction, whereas I01 and I05 have many smooth regions, which leads to a better compression ratio.
Fig. 7. Coding efficiency comparison of the proposed method with the 4D LF based compression in [18] for the Bikes LF image (the number of images in the focal stack is K = 11).
- 2. Quality Evaluation: focal stack vs. all-in-focus images
The refocusing application allows users to enjoy "click-and-refocus" as well as to view the all-in-focus image, as shown in [22]. In Fig. 8, the first three images on the left illustrate the image focused on different objects. The focus map has many colors corresponding to the different depth layers; a user can click on any layer, and the image corresponding to that layer will be displayed on the screen. In [10], the authors verified that the creation of the all-in-focus image is not greatly affected by the estimated focus map, but is influenced mainly by the distorted texture images in the focal stack. That is why, in this paper, we did not evaluate the quality of the focus map estimated from the focal stack (i.e., the focus map can be estimated reliably from the focal stack even at a larger QP). In Fig. 9, we can see that our all-in-focus image has much better quality in both PSNR and SSIM, because our proposed method maintains the detailed texture of the image better than the 4D LF compression, even at a low bitrate.
Fig. 8. Quality comparison of RFI's for the Bikes LF image: (a) original image; (b) compressed 4D LF image at 872.23 Kbps; (c) compressed focal stack image at 552.87 Kbps.
Fig. 9. Quality of all-in-focus images for the Bikes LF image generated from the focal stacks in Fig. 8 (PSNR [dB] / SSIM / bitrate [Kbps]).
Ⅵ. Conclusion
In this paper, we proposed an LF compression framework mainly for refocusing applications. We first generate the focal stack and then compress it as a pseudo video sequence using HEVC. The results showed that our proposed framework outperforms the state-of-the-art 4D LF based compression method in compression performance.
This work was supported in part by the National Research Foundation of Korea (NRF) grant (2017R1A2B2006518) funded by the Ministry of Science and ICT.
BIO
Vinh Van Duong
- 2017 : Hanoi University of Science and Technology (Vietnam)
- 2017 ~ Current : Combined Master & PhD student at Digital Media Lab, Sungkyunkwan University (Korea)
- Research interest : Video coding, Image enhancement and Light field imaging
Thuong Nguyen Canh
- 2012 : Hanoi University of Science and Technology (Vietnam)
- 2014 : Master at Digital Media Lab, Sungkyunkwan University (Korea)
- 2019 : PhD at Digital Media Lab, Sungkyunkwan University (Korea)
- 2019 ~ Current : Post-Doctoral Fellow at Institute for Datability Science, Osaka University (Japan)
- Research interest : Video coding, Compressive sensing, and Light field imaging
Thuc Nguyen Huu
- 2017 : University of Engineering and Technology-Vietnam National University (Vietnam)
- 2017 ~ Current : Combined Master & PhD student at Digital Media Lab, Sungkyunkwan University (Korea)
- Research interest : Video coding and Light field imaging
Byeungwoo Jeon
- 1985 : Department of Electronics Engineering, Seoul National University, Seoul, Korea (BS)
- 1987 : Department of Electronics Engineering, Seoul National University, Seoul, Korea (MS)
- 1992 : School of Electrical Engineering, Purdue University, West Lafayette, USA (PhD)
- 1993 ~ 1997 : Signal Processing Lab, Samsung Electronics, Suwon, Korea (Principal Research Engineer)
- 1997 ~ Current : School of Electronic Engineering, Sungkyunkwan University, Suwon, Korea (Professor)
- Research interest : Multimedia, Video Compression, Signal Processing
References
Lippmann G. 1908 Epreuves reversibles, photographies integrales Academie des sciences 446 - 451
Ives H. 1928 A Camera for Making Parallax Panoramagrams J. Opt. Soc. Am. 17 (4) 435 - 439    DOI : 10.1364/JOSA.17.000435
Levoy M. 2006 Light field microscopy ACM Trans. Graph. 25 (3) 924 - 934    DOI : 10.1145/1141911.1141976
Fife K. 2008 A 3mpixel multi-aperture image sensor with 0.7um pixels in 0.11um cmos, IEEE ISSCC Digest of Technical Papers 48 - 49
Ng R. 2005 Light field photography with a hand-held plenoptic camera Computer Science Technical Report, vol. 2, no. 11
2017 ISO/IEC JTC 1/SC29/WG1 JPEG, JPEG Pleno Call for Proposals on Light Field Coding, ISO/IEC JTC 1/SC29/WG1 JPEG Doc. N74014
Rerabek M. 2016 ICME 2016 grand challenge: Light-field image compression, Call for proposals and evaluation procedure
Viola I. , Ebrahimi T. 2018 Quality Assessment of Compression Solutions for ICIP 2017 Grand Challenge on Light Field Image Coding Proc. 9th Workshop on Hot Topics in 3D Multimedia
ICME 2018 Grand Challenge on Densely Sampled Light Field Reconstruction available at
Rizkallah M. 2016 Impact of light field compression on refocused and extended focus images Proc. European Conf. Sig. Process. 898 - 902
Perra C. 2018 Effect of light field subsampling on the quality of experience in refocusing applications Proc. IEEE Int. Conf. Quality Multi. Exper.
Paudyal P. 2017 Towards the perceptual quality evaluation of compressed light field images IEEE Trans. Broadcast. 63 (3) 507 - 522    DOI : 10.1109/TBC.2017.2704430
Kara P. A. 2018 Evaluation of the Concept of Dynamic Adaptive Streaming of Light Field Video IEEE Trans. Broadcast. 64 407 - 421    DOI : 10.1109/TBC.2018.2834736
Levin A. , Durand F. 2010 Linear view synthesis using a dimensionality gap light field prior Proc. IEEE Conf. Comp. Vision Pattern Recog. 1831 - 1838
Pertuz S. 2018 Focus model for metric depth estimation in standard plenoptic cameras ISPRS Journal of Photogrammetry and Remote Sensing 144 38 - 47    DOI : 10.1016/j.isprsjprs.2018.06.020
Levoy M. , Hanrahan P. 1996 Light field rendering SIGGRAPH 31 - 42
Dansereau D. Light Field Toolbox v0.4 available at
Viola I. 2017 Comparison and evaluation of light field image coding approaches IEEE J. Select. Topic Sig. Process. 11 (7) 1092 - 1106    DOI : 10.1109/JSTSP.2017.2740167
Canh T. N. , Duong V. V. , Jeon B. 2019 Boundary handling for video based light field coding with a new hybrid scan order Proc. Inter. Workshop on Advanced Image Tech. 1 - 4
Duong V. V. , Canh T. N. , Jeon B. 2018 Light Field Image Coding for Efficient Refocusing Proc. IEEE 12th Inter. Symp. Emb. Multi./Many-core Sys. On Chip 74 - 78
Sakamoto T. 2016 A study on efficient compression of multi-focus images for dense light field reconstruction Proc. Vis. Commum. Image Processing(VCIP) 1 - 4
Mousnier A. Partial light field tomographic reconstruction from a fixed camera focal stack available at
Boykov Y. , Veksler O. , Zabih R. 2001 Fast approximate energy minimization via graph cuts IEEE Trans. on Pattern Analysis and Machine Intelligence 23 (11) 1222 - 1239
HEVC reference software, HM 16.17 available at
Rerabek M. , Ebrahimi T. 2016 New light field image dataset Proc. 8th Int. Conf. Quality Multi. Exp. 1 - 2
Bjontegaard G. 2001 Calculation of Average PSNR Differences Between RD-Curves, ITU-T SG16 Q.6, VCEG-M33 Geneva, Switzerland
Perez F. 2016 Lightfield recovery from its focal stack J. Math. Imag. Vis. 56 (3) 573 - 590    DOI : 10.1007/s10851-016-0658-4