An Improved Saliency Detection for Different Light Conditions
KSII Transactions on Internet and Information Systems (TIIS). 2015. Mar, 9(3): 1155-1172
Copyright © 2015, Korean Society For Internet Information
  • Received : May 27, 2014
  • Accepted : February 27, 2015
  • Published : March 31, 2015
About the Authors
Yongfeng Ren
Faculty of Computer Engineering, Huaiyin Institute of Technology Huai’an, 223003, P. R. China
Jingbo Zhou
Faculty of Computer Engineering, Huaiyin Institute of Technology Huai’an, 223003, P. R. China
Zhijian Wang
College of Computer and Information, Hohai University, Nanjing, 211100, P. R. China
Yunyang Yan
Faculty of Computer Engineering, Huaiyin Institute of Technology Huai’an, 223003, P. R. China

Abstract
In this paper, we propose a novel saliency detection framework based on illumination invariant features to improve the accuracy of saliency detection under different light conditions. The proposed algorithm is divided into three steps. First, we extract illumination invariant features based on locality sensitive histograms to reduce the effect of illumination. Second, a preliminary saliency map is obtained in the CIE Lab color space. Last, we use the region growing method to fuse the illumination invariant features and the preliminary saliency map into a new framework. In addition, we integrate the information of spatial distinctness, since salient objects are usually compact. Experiments on the benchmark dataset show that the proposed saliency detection framework outperforms state-of-the-art algorithms under different illumination conditions.
1. Introduction
Saliency detection, the task of detecting the objects in an image or video that attract the human visual system, has potential applications in computer vision and multimedia tasks, such as object recognition [1], image resizing [2], image retrieval [3], automatic video-to-comics conversion [24], automatic multimedia tagging [25], video accessibility enhancement [26], visual attention [27], and content-aware image editing [4].
Saliency models have been developed along top-down and bottom-up approaches. The former is related to recognition processing influenced by prior knowledge such as the task to be performed, the feature distribution of the target, the context of the visual scene, and so on [5-7, 28]. The latter, which is data-driven and task-independent, mainly consists of three steps, namely feature selection, saliency calculation, and map normalization [9-17]. First, low-level features, such as color, intensity, orientation and motion, are selected as the basic elements supporting saliency detection. Second, the saliency value for each pixel in an input image is computed according to a predefined model. In the end, saliency maps obtained from different sources are integrated and normalized to produce the final result. In this paper, we focus on bottom-up salient object detection.
There have been several studies related to saliency detection in recent years. Itti et al. [8] defined a salient region by counting the differences between the central region and its surrounding areas across multi-scale images. Ma and Zhang [9] used a fuzzy growth model to generate saliency maps. Harel et al. [10] combined the saliency maps of Itti et al. [8] with other feature maps to highlight the distinctive regions of an image. Hou and Zhang [11] constructed a saliency map by extracting the spectral residual of an image in the spectral domain. Achanta et al. [12] determined the salient regions in images using low-level luminance and color features. Recently, Goferman et al. [4] considered both local and global features to highlight salient objects enhanced by means of visual organization. Moreover, Achanta et al. [13] proposed a frequency-tuned method that defines pixel saliency based on a frequency-domain analysis and the differences in color from the average image color. Zhai and Shah [14] defined pixel-level saliency by constructing spatial and temporal attention models. Cheng et al. [15] used histogram- and region-based contrast to compute saliency maps with high precision and recall rates.
With the development of computers and algorithms, saliency detection has improved tremendously. However, the illumination of images in practical applications is not always as ideal as that of the images in the datasets used in this research field, and many existing saliency detection methods do not attach great importance to the illumination problem. Detecting salient objects in the same image under different illumination remains a challenging task. For example, we simulate different illumination conditions and run the code of earlier works, i.e., RC [15] and SDSP [16], on the MSRA 1000 database [13]. Fig. 1 shows the results. Fig. 1 (a4) is the normal illumination in the aforementioned database. We reduce the light gradually, as shown in Fig. 1 (a1) ~ (a3); similarly, Fig. 1 (a5) ~ (a8) show progressively brightened images. It can be seen that both RC and SDSP, shown in Fig. 1 (b2) and Fig. 1 (b3) respectively, are influenced by the illumination of the images. Since a preferable light condition is very difficult to guarantee in applications, it is necessary to improve the accuracy of saliency detection under poor illumination. In this paper, we propose a novel saliency detection framework based on illumination invariant features to improve the accuracy of saliency detection under different light conditions.
Fig. 1. Examples of the saliency maps in different illumination conditions.
In this paper, our work consists mainly of two parts. First, we add different illumination conditions to the publicly available MSRA 1000 database provided by Achanta et al. [13], which includes 1000 images. As far as we know, this database is the largest and best image dataset for saliency detection research, and it has ground truth in the form of accurate human-marked labels for salient regions. Each image in the dataset is transformed to different light conditions by gradually reducing or increasing the light. Second, we propose a new algorithm to detect salient objects based on illumination invariant features [18]. The proposed algorithm extracts the illumination invariant features from the given image and uses the region growing method to detect the salient regions. Since the illumination invariant features are independent of the light and retain the saliency information of the original image, we can exploit this characteristic to improve the precision of saliency detection under different illuminations.
The contributions of this paper are summarized as follows:
(1) We propose a new framework to reduce the impact of illumination in saliency detection, an issue that, to our knowledge, has not been discussed before. In our proposed framework, we exploit illumination invariant features (IIF), which efficiently capture features that are invariant to the illumination conditions of the image, as shown in Fig. 2;
Fig. 2. Diagram of our proposed model.
(2) The proposed algorithm fuses several priors, such as color distinctness and spatial distinctness, into one framework to improve the accuracy of saliency detection. Following the observation about the human visual system that warm colors, such as red and yellow, are more pronounced to our eyes than cold ones, such as green and blue, we propose a simple method to model this prior by analyzing the color space of the image.
The remainder of the paper is organized as follows: we state the foundations in Section 2. In Section 3, we describe the framework of our saliency detection method in detail. Then, we present our experimental results on a public image dataset and compare them with other state-of-the-art saliency detection methods in Section 4. The final section concludes the paper by summarizing our findings.
2. Illumination invariant features
First of all, we introduce the illumination invariant features (IIF) [18], which allow us to extract features that are independent of the illumination conditions of an image.
Let $I_p$ and $I'_p$ denote the intensity values of pixel $p$ before and after an affine illumination change. The two values are related by

$$I'_p = A_p(I_p) = a_{1,p}\, I_p + a_{2,p}, \qquad (1)$$

where $a_{1,p}$ and $a_{2,p}$ are the two parameters of the affine transform $A_p$ at pixel $p$. Let $H_p$ denote the histogram computed from a window $W_p$ centered at pixel $p$, and let $b_p$ denote the bin corresponding to the intensity value $I_p$. According to the definition of the histogram, the number of pixels in $W_p$ whose intensity value resides in $[b_p - r_p,\ b_p + r_p]$ is

$$S_p = \sum_{b = b_p - r_p}^{b_p + r_p} H_p(b), \qquad (2)$$

where the parameter $r_p$ controls the interval of integration at pixel $p$. We set

$$r_p = \kappa\, \bar{I}_p,$$

where $\kappa = 0.1$ is a constant and $\bar{I}_p = \frac{1}{|W_p|}\sum_{q \in W_p} I_q$ is the mean intensity value of window $W_p$, with $|W_p|$ the number of pixels in $W_p$. Under the additional assumption that the affine illumination change is locally smooth, so that the affine transform is the same for all pixels inside window $W_p$, the interval of integration at pixel $p$ under the new illumination can be expressed as

$$r'_p = \kappa\, \bar{I}'_p = \kappa\,(a_{1,p}\, \bar{I}_p + a_{2,p}).$$

The integrated value $S'_p$ obtained under the different illumination condition corresponds to the number of pixels whose intensity values reside in $[b'_p - r'_p,\ b'_p + r'_p]$, where $b'_p$ is the bin corresponding to $I'_p$. Ignoring the quantization error, $S_p$ is equal to $S'_p$. Thus $S_p$ is independent of affine illumination changes and can be used as a matching invariant under different illumination conditions.
In practice, it is inaccurate to assume an exact local window inside which the affine illumination transform remains unchanged. Hence, we replace the histogram $H_p$ in equation (2) with the locality sensitive histogram $H^E_p$, which adaptively takes into account the contributions from all image pixels [18]. In addition, we use a "soft" interval to reduce the quantization error.
In our experiments, $a_{2,p}$ is relatively small, so $r_p$ can be replaced by $\kappa I_p$, and $S_p$ becomes

$$S_p = \sum_{b} H^E_p(b)\, \max\!\left(0,\ 1 - \frac{|b - b_p|}{\kappa I_p}\right).$$
Examples of extracting the illumination invariant features are shown in Fig. 3 (b1) ~ (b5). Since we ignore $a_{2,p}$ when $S_p$ is computed, the illumination invariant features contain many errors when the light changes greatly, as shown in Fig. 3 (b6). However, the light condition in Fig. 3 (b6) rarely occurs in practice. Therefore, the method can be used to improve saliency detection for different light conditions.
Fig. 3. Examples of extracting the IIF. The images on the top are shown under different illumination conditions and the transitional images of their illumination invariant features are shown below them.
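To make the feature concrete, the following NumPy sketch computes a simplified per-pixel version of the count in equation (2). It substitutes a plain local-window count for the locality sensitive histogram and soft interval of [18], and compares intensities directly instead of histogram bins; the window size and these simplifications are our own assumptions, not part of the original method.

```python
import numpy as np

def iif_window(gray, half_win=8, kappa=0.1):
    """Simplified illumination-robust feature: for every pixel, the fraction of
    pixels in a local window whose intensity lies within kappa * (window mean)
    of the centre pixel's intensity (cf. equation (2))."""
    h, w = gray.shape
    feat = np.zeros((h, w), dtype=np.float32)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - half_win), min(h, y + half_win + 1)
            x0, x1 = max(0, x - half_win), min(w, x + half_win + 1)
            win = gray[y0:y1, x0:x1].astype(np.float32)
            r = kappa * win.mean()                        # interval of integration r_p
            inside = np.abs(win - float(gray[y, x])) <= r
            feat[y, x] = inside.mean()                    # normalised count
    return feat
```

Multiplying the window by a gain $a_{1,p}$ scales both the intensities and the interval $r_p$, so the count is essentially unchanged; an additive offset $a_{2,p}$ introduces only a small error, which is why the text assumes it is small.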
3. Proposed saliency framework
Based on the discussion in Section 2, we propose a novel algorithm to improve saliency detection for different light conditions. In the proposed algorithm, we first compute saliency maps from several priors, such as color distinctness and spatial distinctness [8]. Then, we compute the saliency map based on illumination invariant features using the method introduced in Section 2. Finally, we obtain the final saliency map by fusing the information generated in the first two steps.
In the proposed saliency framework, color distinctness is considered first. Some studies [19] conclude from daily experience that warm colors, such as red and yellow, are more pronounced to the human visual system than cold ones, such as green and blue. In this paper, we propose a simple yet effective method to model this prior.
As stated before, we convert the RGB color space to the CIE Lab color space. Lab is an opponent color space, in which the a-channel represents green-red information while the b-channel represents blue-yellow information. If a pixel has a smaller (greater) a value, it appears greenish (reddish). In the same manner, if a pixel has a smaller (greater) b value, it appears bluish (yellowish). Hence, if a pixel has a higher a or b value, it appears "warmer"; otherwise, it appears "colder".
Based on the aforementioned analysis, we devise a metric to evaluate the "color saliency" of a given pixel. First, we map the a and b channels $f_a(x)$ and $f_b(x)$ linearly to $f_{an}(x) \in [0,1]$ and $f_{bn}(x) \in [0,1]$ by

$$f_{an}(x) = \frac{f_a(x) - \min_a}{\max_a - \min_a},$$

$$f_{bn}(x) = \frac{f_b(x) - \min_b}{\max_b - \min_b},$$

where $\min_a$ ($\max_a$) is the minimum (maximum) value of $\{ f_a(x) : x \in \Omega \}$ and $\min_b$ ($\max_b$) is the minimum (maximum) value of $\{ f_b(x) : x \in \Omega \}$. Thus, each pixel $x$ is mapped to a point in the color plane $(f_{an}, f_{bn}) \in [0,1] \times [0,1]$. Intuitively, in this color plane the point $(0, 0)$ is the "coldest" point and thus the "least salient" one. Therefore, we define the color saliency of a point $x$ in a straightforward manner as

$$S_C(x) = 1 - \exp\!\left(-\frac{f_{an}^2(x) + f_{bn}^2(x)}{\sigma_c^2}\right),$$

where $\sigma_c$ is a parameter.
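A minimal sketch of this warm-color prior, assuming scikit-image for the Lab conversion; the value of sigma_c is an assumed setting, since the paper leaves $\sigma_c$ as a free parameter.

```python
import numpy as np
from skimage.color import rgb2lab

def color_saliency(rgb, sigma_c=0.25):
    """Warm-color prior: min-max normalise the a and b channels to [0, 1]
    and treat points far from (0, 0) in the (fan, fbn) plane as salient."""
    lab = rgb2lab(rgb)                      # rgb: float image in [0, 1] or uint8
    a, b = lab[..., 1], lab[..., 2]
    fan = (a - a.min()) / (a.max() - a.min() + 1e-12)
    fbn = (b - b.min()) / (b.max() - b.min() + 1e-12)
    return 1.0 - np.exp(-(fan ** 2 + fbn ** 2) / sigma_c ** 2)
```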
After computing the color distinctness, a rough area of the salient region in the current image can be obtained. However, the result of this step is less accurate because it lacks information about spatial distinctness. Some studies have concluded that objects near the image center are more attractive to humans than others [20]. This implies that pixels near the center of the image are more likely to be "salient" than the ones far away from the center. Therefore, we generate a prior map using a Gaussian distribution based on the distances of the pixels to the image center:

$$S_I(x) = \exp\!\left(-\frac{d^2(x, c)}{\sigma_I^2}\right),$$

where $\sigma_I$ is a parameter, $c$ is the center of the image, and $d(x, c)$ is the distance between point $x$ and $c$.
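The center prior can be sketched in the same style; the default value of sigma_i (a quarter of the image diagonal) is an assumption, since the paper only states that $\sigma_I$ is a parameter.

```python
import numpy as np

def center_prior(height, width, sigma_i=None):
    """Gaussian centre prior: pixels near the image centre receive higher weight."""
    if sigma_i is None:
        sigma_i = 0.25 * np.hypot(height, width)   # assumed default scale
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    d2 = (ys - cy) ** 2 + (xs - cx) ** 2           # squared distance to the centre c
    return np.exp(-d2 / sigma_i ** 2)
```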
According to the above methods, three results are obtained: $S_P$, the saliency based on the illumination invariant features; $S_C$, the color saliency; and $S_I$, the spatial saliency. To combine the saliency maps generated from the different cues, we exploit a Bayesian method similar to [21]. Letting $p(x_f \mid S_j)$ denote the saliency map generated from cue $S_j$, the fusion map is

$$S(x_f) = \frac{1}{Z} \prod_{j \in \{P, C, I\}} p(x_f \mid S_j),$$

where $Z$ is chosen in such a way that the final map is a probability density function (pdf).
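A sketch of one possible reading of this fusion step: each cue map is rescaled to [0, 1], the maps are multiplied pixel-wise (a naive-Bayes style product), and the result is divided by Z so that it sums to one. The product rule is our interpretation of the "Bayesian method similar to [21]"; other combination rules (sum, max) would fit the same description.

```python
import numpy as np

def fuse_maps(maps):
    """Fuse cue maps (e.g. S_P, S_C, S_I) by a normalised pixel-wise product."""
    fused = np.ones_like(maps[0], dtype=np.float64)
    for m in maps:
        m = (m - m.min()) / (m.max() - m.min() + 1e-12)   # rescale cue to [0, 1]
        fused *= m + 1e-6                                  # small floor avoids all-zero maps
    return fused / fused.sum()                             # Z: normalise to a pdf
```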
Finally, the region growing algorithm [23] is used to find the salient image regions, starting from the outline pixels in the transitional image of the illumination invariant features.
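The paper does not spell out the growing criterion, so the following is only a generic 4-connected region-growing sketch: starting from seed pixels (for instance, pixels on the outline of the IIF transitional image), a neighbour joins the region when its value is close to the pixel it grows from; the tolerance is an assumed parameter.

```python
from collections import deque
import numpy as np

def region_grow(feature, seeds, tol=0.1):
    """Grow a binary region from seed coordinates over a 2-D feature map."""
    h, w = feature.shape
    mask = np.zeros((h, w), dtype=bool)
    queue = deque(seeds)
    for y, x in seeds:
        mask[y, x] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                if abs(feature[ny, nx] - feature[y, x]) <= tol:   # similarity criterion
                    mask[ny, nx] = True
                    queue.append((ny, nx))
    return mask
```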
An example of saliency maps based on illumination invariant features is shown in Fig. 4. The first and second rows are images with normal illumination. The third and fourth rows are the same images with 0.8 times the illumination of the first and second rows. Fig. 4 (a) shows the images from the MSRA 1000 database, and Fig. 4 (b) shows their binary ground truth. We compute the color and spatial distinctness, shown in Fig. 4 (c) and Fig. 4 (d) respectively. The illumination invariant features described in Section 2 are shown in Fig. 4 (e), and the corresponding saliency map is shown in Fig. 4 (f). Last, we fuse the color distinctness, the spatial distinctness and the saliency map based on IIF into the final saliency map shown in Fig. 4 (g). From the comparison, we can see that the saliency map that incorporates the illumination invariant features is more accurate than the one without this information.
Fig. 4. Examples of saliency maps based on illumination invariant features.
4. Experiments and analysis
In order to verify the proposed method, we have evaluated our approach on the publicly available database provided by Achanta et al. [13]. As far as we know, this database is the largest and best image dataset for saliency detection research, and it has ground truth in the form of accurate human-marked labels for salient regions. The experiments cover two aspects. First, we add illumination changes to the image database and compare our method with seven other state-of-the-art saliency detection methods, namely FT [13], RC [15], LC [14], HC [15], SDSP [16], SR [11] and GBVS [10], on this database. Second, we also evaluate our algorithm and the state-of-the-art saliency detection methods under normal illumination for the purpose of a fair comparison. To evaluate the saliency maps generated by the aforementioned methods, we use the precision-recall curve and the F-measure, similar to [13, 15].
- 4.1 Datasets
Since there is no image database available for testing the effect of illumination on saliency detection, we first build an image database based on the MSRA 1000. Because the RGB color space has no direct correlation with lighting information, we convert the RGB color space to the CIE Lab color space. Lab is an opponent color space in which the L-channel represents lightness. We reduce or increase the value of the L-channel and then convert the image back from CIE Lab to RGB.
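A minimal sketch of this relighting step, assuming scikit-image for the color conversions; multiplying the L-channel by a constant factor is our reading of "reduce or increase the value of the L-channel", and the clipping bounds are assumptions.

```python
import numpy as np
from skimage.color import rgb2lab, lab2rgb

def relight(rgb, factor):
    """Simulate a global illumination change by scaling the L channel in CIE Lab.
    factor ranges over 0.1 ... 2.0 in the dataset construction described here."""
    lab = rgb2lab(rgb)                                   # rgb: float in [0, 1] or uint8
    lab[..., 0] = np.clip(lab[..., 0] * factor, 0.0, 100.0)
    return np.clip(lab2rgb(lab), 0.0, 1.0)
```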
These operations simulate changes of the light, turning every image into 20 images under different illumination. Fig. 5 shows 11 of the 20 images under different illumination for one example. The images in Fig. 5 can still be observed well by humans. The other 9 images, whose illumination is 0.1 ~ 0.4 or 1.6 ~ 2.0 times the normal one, are too dark or too bright to be observed by humans. In Fig. 5, (f) is the image under the normal illuminant condition; (a) ~ (e) are under illuminant conditions of 0.5 ~ 0.9 times the normal illumination; and (g) ~ (k) are under illuminant conditions of 1.1 ~ 1.5 times the normal illumination. From these images, we can see that the angle of the light remains the same while the intensity changes. Nevertheless, this simulation approximates real lighting changes reasonably well. By modifying the illumination of each image, we expand the image dataset that forms the basis of the experiments in this paper.
Fig. 5. Examples of images under different illuminant conditions.
- 4.2 Evaluating measure
The results generated by the proposed model are evaluated in two different ways. For the first evaluation, a fixed threshold within [0, 255] is used to construct a binary foreground mask from the saliency map. Then, the binary mask is compared with the ground truth mask to obtain a precision-recall (PR) pair. Precision is the ratio of the correctly detected salient region to the whole detected region, and recall is the ratio of the correctly detected salient region to the ground-truth salient region. We vary the threshold over its entire range to obtain the PR curve for one image; the average precision-recall curve is obtained by averaging the results over all test images. For the second evaluation, we follow [13] and segment a saliency map with an adaptive threshold, i.e.,
$$T_a = \frac{2}{W \times H} \sum_{x=1}^{W} \sum_{y=1}^{H} S(x, y),$$

where $S(x, y)$ is the saliency value at position $(x, y)$, and $H$ and $W$ are the height and width of the image, respectively. If the saliency value of a pixel is larger than the threshold, the pixel is considered foreground. In many applications, high precision and high recall are both required. We thus compute the F-measure [15] as

$$F_\beta = \frac{(1 + \beta^2)\, \mathrm{Precision} \times \mathrm{Recall}}{\beta^2\, \mathrm{Precision} + \mathrm{Recall}},$$

where we set $\beta^2 = 0.3$ to emphasize precision.
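The two evaluation quantities can be sketched as follows; the adaptive threshold (twice the mean saliency) follows [13] and beta^2 = 0.3 follows the text, while the small epsilon terms are only there to avoid division by zero.

```python
import numpy as np

def precision_recall(saliency, gt):
    """Binarise a saliency map with the adaptive threshold and compare it
    against a binary ground-truth mask."""
    t = 2.0 * saliency.mean()                  # T_a: twice the mean saliency [13]
    fg = saliency >= t
    tp = np.logical_and(fg, gt > 0).sum()      # correctly detected salient pixels
    precision = tp / (fg.sum() + 1e-12)
    recall = tp / ((gt > 0).sum() + 1e-12)
    return precision, recall

def f_measure(precision, recall, beta2=0.3):
    """F-measure with beta^2 = 0.3 to emphasise precision."""
    return (1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-12)
```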
- 4.3 The comparison under normal illuminant condition
In this subsection, we compare the proposed method with state-of-the-art saliency detection methods on the MSRA 1000 database.
A visual comparison of the saliency maps obtained by our method, SDDLC, and the other algorithms is given in Fig. 6, Fig. 7 and Table 1. In Fig. 6, the results show that the saliency maps generated by SDDLC, RC [15], HC [15] and SDSP [16] are consistently better than those of FT [13], LC [14], SR [11] and GBVS [10]. In Fig. 7, the precision-recall curve of our method SDDLC is higher than those of the other methods, indicating that the saliency maps computed by our method are smoother and contain more pixels with the saliency value 255. Finally, we compute the F-measure of the different algorithms as defined in Section 4.2. The results are shown in Table 1. It can be seen that the F-measures of SDDLC, RC, HC and SDSP are all higher than 0.7, while the results of FT, LC, SR and GBVS are all less than 0.6.
Fig. 6. Examples of saliency maps by different algorithms under the normal illuminant condition. (a) input images; (b) SDDLC; (c) FT [13]; (d) RC [15]; (e) LC [14]; (f) HC [15]; (g) SDSP [16]; (h) SR [11]; (i) GBVS [10].
Fig. 7. Precision and recall for all algorithms under the normal illuminant condition.
Table 1. F-measure for each algorithm.
The models FT, LC, SR and GBVS generally detect the foreground of input images. However, their saliency maps are easily influenced by the background, so the detected salient area contains not only the salient object but also the cluttered background. The HC model obtains saliency maps from the contrast between colors, which carry more information than the gray-level image. The results generated by HC, shown in Fig. 6 (f) and Fig. 7, are good; however, HC is affected by fine structures since spatial information is not taken into account. The RC model, shown in Fig. 6 (d) and Fig. 7, adds information about the size and location of the salient area to the HC model, so its saliency maps are better than those of HC. However, similar to FT, it is also influenced by the cluttered background. The reason might be that RC computes color contrast based on color histograms to measure the difference between two regions; it fails when a region in the salient object and a region in the background have the same color histogram. From the fifth to the eighth rows in Fig. 6 (d), the salient regions detected by RC contain not only the salient object but also the cluttered background.
The SDSP model [16], shown in Fig. 6 (g) and Fig. 7, generates saliency maps from the contrast of the L-channel, a-channel and b-channel in the CIE Lab color space, which together encode both lightness and color information. Therefore, the saliency maps of SDSP are better than those of RC when the salient regions have the same color histogram as the background. Our method SDDLC incorporates the illumination invariant features to reduce the effect of illumination and emphasize the salient object in each image. The results of the proposed method are shown in Fig. 6 (b) and Fig. 7. We can see that the proposed algorithm outperforms the other state-of-the-art methods under normal illumination.
- 4.4 The comparison under different illuminant conditions
In this section, we compare the results of the proposed algorithm with the other algorithms under different illuminant conditions. The results of the different algorithms are shown in Fig. 8. In addition, we use precision and recall to measure the results generated by the algorithms mentioned before; the comparison is shown in Fig. 9. Finally, we compute the F-measure of the results of the different algorithms, shown in Table 2.
Fig. 8. Saliency maps by different methods under different illuminant conditions. (a) input images; (b) SDDLC; (c) FT [13]; (d) RC [15]; (e) LC [14]; (f) HC [15]; (g) SDSP [16]; (h) SR [11]; (i) GBVS [10].
Fig. 9. Precision and recall by different methods. (a) 0.5 times the normal illuminant condition; (b) 1.5 times the normal illuminant condition; (c) average precision and average recall by different methods under different illuminant conditions.
Table 2. F-measure by different methods for different illumination.
As can be seen in Fig. 8, the results generated by our method SDDLC are more robust than those of the other methods. FT, SR and LC are affected not only by light changes but also by background noise. GBVS, although it seems robust to the variation of illumination, cannot detect the object accurately. RC and SDSP, which are comparable to the proposed algorithm, are heavily influenced by cluttered backgrounds. Fig. 9 (a) and (b) show the precision-recall curves for all algorithms under 0.5 and 1.5 times the normal illuminant condition. When the illumination becomes higher than 1.5 or lower than 0.5 times the normal one, some algorithms fail to detect the salient areas at all; hence, these extreme illuminant conditions test all algorithms thoroughly. As shown in Fig. 9 (a), the precision-recall curve of our method SDDLC is higher than those of the other methods. Among the other models, the precision and recall of HC are worse than those of the other methods when the light is 0.5 times the normal illumination or lower. In Fig. 9 (b), our method SDDLC is again higher than the other methods; SR is poorer than the other methods when detecting salient objects in bright images, and LC becomes the most unstable when the light is 1.5 times the normal illumination. Fig. 9 (c) compares the average precision-recall curves for all algorithms under 0.1 ~ 2.0 times the normal illuminant conditions. It can also be seen that HC is poor in both precision and recall compared with the other methods. The precision-recall curve of our method SDDLC is higher than those of the other methods, since the saliency maps computed by our method are smoother and contain more pixels with the saliency value 255.
We compute the F-measure of the different algorithms under 0.1 ~ 2.0 times the normal illuminant conditions in Table 2 and Fig. 10. As shown in Table 2, all algorithms except FT, LC and SR detect the salient regions fairly accurately under moderate changes. As the light becomes brighter or darker, the F-measure of each algorithm decreases gradually. As shown in Fig. 10, the proposed method outperforms the other methods when the image is dark; as the image becomes brighter, precision and recall decay faster. Among the other methods, the results of RC and HC decline fastest, since both RC and HC depend on the color histogram, which is severely affected by lighting changes [22]. When the illumination is 0.1 ~ 0.2 or 1.9 ~ 2.0 times the normal illumination, the results of RC and HC are already close to zero. In contrast, the results of our method SDDLC and of GBVS are robust to the lighting changes compared with the other models.
Fig. 10. F-measure by different methods under different illuminant conditions.
In Fig. 11, we show the average precision, average recall and average F-measure of the different methods under 0.1 ~ 2.0 times the normal illuminant conditions. Generally speaking, the precision indicates how well a saliency detection algorithm matches the ground-truth saliency map: it is the ratio of the correctly detected region to the whole detected region, so we pay particular attention to it when comparing the proposed model with others. From Fig. 11, we find that the performance of our algorithm is close to that of SDSP but better than the others.
Fig. 11. Average precision, recall and F-measure by different methods under different illuminant conditions.
5. Conclusion
We have proposed a novel saliency detection framework based on illumination invariant features to improve saliency detection under different light conditions. In our framework, illumination invariant features are first extracted from the input image, based on locality sensitive histograms, to build a transitional image that reduces the effect of illumination. Then a preliminary saliency map of the image is obtained in the CIE Lab color space. Finally, we use the region growing method to fuse the illumination invariant features and the preliminary saliency map together with the information of spatial distinctness. We evaluate our method on an image dataset built from a publicly available dataset and compare our scheme with state-of-the-art models under normal and modified illumination conditions. The resulting saliency maps are much less sensitive to background texture under the normal illumination condition.
Our future work will focus on high-level knowledge, which could be beneficial for handling more challenging cases, and on embedding other kinds of saliency cues or priors into our framework.
BIO
Yongfeng Ren has been a doctoral student at the College of Computer and Information, Hohai University, since September 2012. He is also a lecturer at Huaiyin Institute of Technology. His research interests include pattern recognition, image processing, etc.
Jingbo Zhou received his PhD degree in control science and engineering from Nanjing University of Science and Technology (NUST) in 2013. He is a lecturer at Huaiyin Institute of Technology. His research interests include pattern recognition, image processing, etc.
Zhijian Wang received his BS degree in Software Engineering from Nanjing University in 1982. He received his MS degree in Software Engineering from Nanjing University in 1986 and his PhD degree from Nanjing University, on the subject of Software Engineering, in 1989. His current research interests include pattern recognition, formal methods, web services, software engineering, software testing, software components and water resources information study.
Yunyang Yan received his BS degree in computer application from Nanjing University of Aeronautics and Astronautics in 1988. He received his MS degree in computer application from Southeast University in 2002 and his PhD degree from NUST, on the subject of pattern recognition and intelligence systems, in 2008. His current research interests include pattern recognition and computer vision.
References
Rutishauser U. , Walther D. , Koch C. , Perona P. “Is bottom-up attention useful for object recognition?” IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2004 vol. 2 37 - 44
Chen L. Q. , Xie X. , Fan X 2003 “A visual attention model for adapting images on small displays,” Multimedia systems 9 (4) 353 - 364    DOI : 10.1007/s00530-003-0105-4
Loupias E. “Wavelet-based salient points for image retrieval,” IEEE in Proc. of Image Processing, Proceedings, International Conference on 2000 vol. 9 518 - 521
Goferman S. , Zelnik-Manor L. , Tal A. 2012 “Context-aware saliency detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence 34 (10) 1915 - 1926    DOI : 10.1109/TPAMI.2011.272
Itti L. 2000 “Models of Bottom-Up and Top-Down Visual Attention,” PhD thesis, California Institute of Technology, Pasadena
Kanan C. , Tong M. , Zhang L. , Cottrell G. 2009 “SUN: Top-down saliency using natural statistics,” Visual Cognition 17 (6) 979 - 1003    DOI : 10.1080/13506280902771138
Lu Z. , Lin W. , Yang X. , Ong E. , Yao S. 2005 “Modeling visual attention's modulatory aftereffects on visual sensitivity and quality evaluation,” IEEE Transactions on Image Processing 14 (11) 1928 - 1942    DOI : 10.1109/TIP.2005.854478
Yu Jun , Rui Yong , Tang Yuan Yan , Tao Dacheng 2014 “High-Order Distance-Based Multiview Stochastic Learning in Image Classification,” IEEE Transactions on Cybernetics 44 (12) 2431 - 2442    DOI : 10.1109/TCYB.2014.2307862
Ma Y-F , Zhang H “Contrast-based image attention analysis by using fuzzy growing,” MULTIMEDIA '03 Proceedings of the eleventh ACM international conference on Multimedia 2003 374 - 381
Harel J. , Koch C. , Perona P. “Graph-based visual saliency,” Advances in neural information processing systems 2007 545 - 552
Hou X. , Zhang L. “Saliency detection: a spectral residual approach,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2007 1 - 8
Achanta R. , Estrada F. , Wils P. , Susstrunk S. “Salient region detection and segmentation,” in Proc. of ICVS Santorini, Greece 2008 vol. 5008 66 - 75
Achanta R. , Hemami S , Estrada F , Süsstrunk S “Frequency-tuned salient region detection,” IEEE conference on computer vision and pattern recognition (CVPR) Miami Beach 2009 1597 - 1604
Zhai Y. , Shah M. “Visual attention detection in video sequences using spatiotemporal cues,” in Proc. of MULTIMEDIA '06 Proceedings of the 14th annual ACM international conference on Multimedia 2006 815 - 824
Cheng M. “Global contrast based salient region detection,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2011 409 - 416
Zhang L. , Gu Z. , Li H. “SDSP: A novel saliency detection method by combining simple priors,” in Proc. of Image Processing (ICIP), 2013 20th IEEE International Conference on 2013 171 - 175
Zhou J. , Jin Z. 2013 “A New Framework for Multiscale Saliency Detection Based on Image Patches,” Neural Processing Letters 38 (3) 361 - 374    DOI : 10.1007/s11063-012-9276-3
He S. “Visual Tracking via Locality Sensitive Histograms,” in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2013 409 - 416
Shen X. , Wu Y. “A unified approach to salient object detection via low rank matrix recovery,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2012 853 - 860
Judd T. , Ehinger K. , Durand F. , Torralba A. “Learning to predict where humans look,” Computer Vision, 2009 IEEE 12th International Conference on (ICCV) 2009 2106 - 2113
Borji A. , Sihite D. N. , Itti L “Salient object detection: A bench-mark,” Springer Berlin Heidelberg in Proc. of 12th European Conference on Computer Vision (ECCV) 2012 414 - 429
Drew Mark S. “Illumination-invariant color object recognition via compressed Chromaticity histograms of color-channel-normalized images,” in Proc. of Computer Vision, 1998. Sixth International Conference on (ICCV) 1998 1 - 8
Sen Y , Qian Y , Avolio A 2014 “Image Segmentation Methods for Intracranial Aneurysm Haemodynamic Research,” Journal of Biomechanics 47 (5) 1014 - 1022    DOI : 10.1016/j.jbiomech.2013.12.035
Wang M. , Hong R. , Yuan X.-T. , Yan S 2012 ”Movie2Comics: towards a lively video content presentation,” IEEE Transactions on Multimedia IEEE 14 858 - 870    DOI : 10.1109/TMM.2012.2187181
Wang M , Ni B , Hua X-S 2012 “Assistive tagging: A survey of multimedia tagging with humancomputer joint exploration,” ACM Comput Surv (CSUR) 44 (4) 25 -    DOI : 10.1145/2333112.2333120
Hong R. , Wang M. , Xu M. , Yan S , Chua “Dynamic captioning: video accessibility enhancement for hearing impairment,” in Proc. of ACM multimedia 2010 421 - 430
Chen Yanxiang , Nguyen Tam V. , Kankanhalli Mohan 2014 “Audio Matters in Visual Attention,” IEEE Transactions on Circuits and Systems for Video Technology 24 (11) 1992 - 2003    DOI : 10.1109/TCSVT.2014.2329380
Hong RC , Wang M , Yuan XT , Xu MD , Jiang JG , Yan SC , Chua TS 2011 “Video accessibility enhancement for hearing-impaired users,” ACM Trans Multimed Comput, Commun Appl 7: Article 24