Content Based Image Retrieval Using Combined Features of Shape, Color and Relevance Feedback
KSII Transactions on Internet and Information Systems (TIIS). 2013. Dec, 7(12): 3149-3165
Copyright © 2013, Korean Society For Internet Information
  • Received : September 04, 2013
  • Accepted : November 24, 2013
  • Published : December 30, 2013
About the Authors
Yasmin Mussarat
COMSATS Institute of Information Technology Pakistan
Sharif Muhammad
COMSATS Institute of Information Technology Pakistan
Mohsin Sajjad
COMSATS Institute of Information Technology Pakistan
Irum Isma
COMSATS Institute of Information Technology Pakistan

Abstract
Content based image retrieval is increasingly gaining popularity among image repository systems, as images are a major source of digital communication and information sharing. Image content is identified through feature extraction, which is the key operation for a successful content based image retrieval system. In this paper, a content based image retrieval system has been developed by combining multiple features of shape, color and relevance feedback. Shape serves as the primary feature to identify images, whereas color and relevance feedback are used as supporting features to make the system more efficient and accurate. Shape features are estimated through the second derivative, least square polynomial and shape coding methods. Color is estimated through the max-min mean of neighborhood intensities. A new technique has been introduced for relevance feedback that does not bother the user.
1. Introduction
Images are an immensely important part of today’s digital world of communication. Initially, image search according to user interest was performed through text and tags associated with images, and this was known as text based image retrieval. Today, image retrieval is carried out according to the visual contents of an image, which is known as content based image retrieval (CBIR). Countless computer applications involve image searching, image matching and image retrieval, for example trade mark or logo searching, signature matching, document matching and so on. Because content based image retrieval has so many applications, there has always been room for improvement, and the topic has attracted a tremendous amount of research over the last few decades.
CBIR searches by taking an image as the query and extracting from the repository those images which are similar in content to the query image. Image features include shape, texture, color or a combination of these. Color and texture features are closer to human perception and are regarded as high level features, whereas shape is structural and is therefore considered a low level feature. Performance issues include retrieval time, a good balance of precision and recall, and the semantic gap. All these issues depend heavily on the feature extraction process; an effective feature extraction method produces successful results. It has been observed that, among all the features, shape is the most powerful because it identifies the shapes present in an image. Relevance feedback is another technique, used for interactive image retrieval or for enhancing user experience. In short, all features have their own roles and advantages, and by combining them their individual advantages and characteristics can be utilized collectively. A combination of multiple features like shape, color and relevance feedback can produce remarkable retrieval results [1 - 3] . Against this backdrop, a combinational approach using multiple image features has been adopted in this paper. Shape description is used as the primary step, whereas color and relevance feedback are added as support for the sake of user perception. Some related work on combining multiple features for content based image retrieval is given in Section 2.
The succeeding sections are arranged as follows: existing work is presented in Section 2, the proposed work is introduced in Section 3, results are presented in Section 4, the conclusion is drawn in Section 5 and references are given in Section 6.
2. Related Work
Many researchers have made efforts to improve CBIR. Some notable works are summarized here. An effective method for image retrieval by combining multiple features of color, texture and shape was presented in [4] . An algorithm for image retrieval which combined color and texture features was proposed in [5] . Here, circular region energy was used to index images; the color feature was extracted from the low frequency band of the wavelet transform and the texture feature from the high frequency band. [6] addressed the problems faced when indexing multiple features into a database. In [7] a combination of shape and color was used, and features were extracted after region of interest (ROI) segmentation. [8] combined shape and color, obtaining color features from the HSI hue value and shape information from curvature scale space (CSS). Relevance feedback was combined with texture, shape and color in [9] , where color information was obtained by cumulative histogram, texture was determined by the color co-occurrence matrix (CCM) and shape information was described by edge detection. A combination of texture, color and shape was also used in [10] in an optimized manner. Some specialized examples of multiple feature image retrieval can also be found in [11 , 12] .
As this study follows a combinational approach to feature selection, the related work presented in this section has been selected from techniques in which multiple features are combined. The proposed approach defines its features from the low level up instead of directly applying existing feature definitions. It uses the least square polynomial to estimate curves in the image and a self-defined coding mechanism for shape description, a self-defined max-min average for color description, and a modified relevance feedback. In this way, image content is detected so that similar images are retrieved more accurately and effectively compared to the related existing techniques. In the next section, the proposed method is discussed in greater detail.
3. Proposed Work
Generally, the task of content based image retrieval (CBIR) is divided into the steps of feature extraction, database population, query representation, similarity measurement and retrieval of similar images, as shown in Fig. 1 . The proposed method follows this general flow and uses the combination of shape, color and relevance feedback for feature extraction. Feature extraction is the backbone of any content based image retrieval system. If the features which index the image are extracted properly, the whole system performs efficiently and accurately. The following sub-sections describe these steps at length.
Fig. 1. General Flow of CBIR
- 3.1 Image Segmentation
Image segmentation is a process of classifying and isolating objects. It is used as a key operation in a number of applications. In this process, objects of interest are extracted and further tasks are performed on these objects rather than on the whole image. Image segmentation is performed through color image segmentation in Matlab 7.0. An image and its segmented version are shown in Fig. 2 .
Fig. 2. Image Segmentation
- 3.2 Key Point Determination
For the determination of key points, second order partial derivatives are used. These derivatives determine points where slopes are different from the surrounding points.
Definition: An image is treated as a function f through the equation y = f ( x ). On the graph of y = f ( x ), let P ( x , y ) and Q ( x + Δ x , y + Δ y ) be distinct points close to each other. If θ is the angle that the secant line PQ makes with the x-axis, then:
$$\tan\theta = \frac{\Delta y}{\Delta x} \tag{1}$$
$$\Delta y = f(x + \Delta x) - f(x) \tag{2}$$
$$\tan\theta = \frac{f(x + \Delta x) - f(x)}{\Delta x} \tag{3}$$
As Δ x approaches 0, the point Q moving along the graph of y = f ( x ) approaches P , the chord PQ approaches the tangent line PT as its limiting position, and the measure θ of angle MAQ approaches Ψ = m∠OTP. Taking limits as Δ x → 0, equation (3) becomes:
$$\tan\Psi = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = f'(x) \tag{4}$$
i.e., the derivative of function f at point P represents the slope of tangent to the curve y = f ( x ) at that point as shown in Fig. 3 .
Fig. 3. Slope Change Measurement
The derivative f′( x ) of f ( x ) may itself have a derivative on [ a,b ]. Applying the definition of the derivative to f′( x ), the resulting limit is called the second derivative of y = f ( x ) and is denoted by:
$$\frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) \tag{5}$$
$$f''(x) = \lim_{\Delta x \to 0} \frac{f'(x + \Delta x) - f'(x)}{\Delta x} \tag{6}$$
Fig. 4 shows an image with key points.
Fig. 4. Image with Key Points
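The idea of locating points where the slope changes can be sketched in one dimension. The following is a minimal illustration, not the authors' Matlab implementation: it assumes a boundary profile sampled at unit spacing, approximates the second derivative with a central second difference, and flags samples where its magnitude exceeds a tolerance. The function names and the `tol` parameter are hypothetical.

```python
def second_difference(y):
    """Central second difference of samples y taken at unit spacing."""
    return [y[i - 1] - 2 * y[i] + y[i + 1] for i in range(1, len(y) - 1)]


def key_points(y, tol=1e-9):
    """Indices where the slope changes, i.e. |f''| exceeds tol (assumed threshold)."""
    d2 = second_difference(y)
    # d2[k] approximates the second derivative at sample index k + 1.
    return [k + 1 for k, value in enumerate(d2) if abs(value) > tol]


# A profile that rises linearly and then stays flat: the slope changes once.
profile = [0, 1, 2, 3, 3, 3, 3]
print(key_points(profile))
```

On a straight segment the second difference is zero, so only the sample where the rising part meets the flat part is reported as a key point.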
- 3.3 Curve Fitting
For curve fitting, the least square polynomial method has been used due to its simplicity and accuracy in estimation. To make the process simpler, only three key points are selected from the set of key points.
Objects from Fig. 4 have been shown with key points in Fig. 5 .
Fig. 5. Objects with Key Points from Image shown in Fig. 4
The general equation of the least square polynomial for a given set of points (x_i, y_i), i = 1, ..., n, is given by:
$$y = c_0 + c_1 x + c_2 x^2 \tag{7}$$
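To estimate the constants c 0 , c 1 , c 2 , the following equations are formed:
$$\sum_{i=1}^{n} y_i = n c_0 + c_1 \sum_{i=1}^{n} x_i + c_2 \sum_{i=1}^{n} x_i^2 \tag{8}$$
$$\sum_{i=1}^{n} x_i y_i = c_0 \sum_{i=1}^{n} x_i + c_1 \sum_{i=1}^{n} x_i^2 + c_2 \sum_{i=1}^{n} x_i^3 \tag{9}$$
$$\sum_{i=1}^{n} x_i^2 y_i = c_0 \sum_{i=1}^{n} x_i^2 + c_1 \sum_{i=1}^{n} x_i^3 + c_2 \sum_{i=1}^{n} x_i^4 \tag{10}$$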
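In matrix notation the above equations can be summarized as:
$$\begin{bmatrix} n & \sum x_i & \sum x_i^2 \\ \sum x_i & \sum x_i^2 & \sum x_i^3 \\ \sum x_i^2 & \sum x_i^3 & \sum x_i^4 \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \sum x_i^2 y_i \end{bmatrix}$$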
Solving the above equation gives the three values c 0 , c 1 , c 2 , and by substituting these values into the equations (8), (9) and (10), curves are estimated for the given key points. There are hundreds of possible curves, but for simplicity only the curves and lines shown in Table 1 have been considered. The estimated points for the given key points are then plotted on the graph and the curves are coded with the values shown in Table 1 .
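As an illustration of the curve fitting step, the sketch below builds the normal equations for a second degree least square polynomial and solves the 3×3 system with Cramer's rule. It is a minimal example under stated assumptions rather than the authors' code: the helper names `det3` and `fit_quadratic` and the sample points are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_quadratic(points):
    """Least squares fit of y = c0 + c1*x + c2*x^2 via the normal equations."""
    sx = [sum(x ** k for x, _ in points) for k in range(5)]          # sums of x^0 .. x^4
    sxy = [sum((x ** k) * y for x, y in points) for k in range(3)]   # sums of x^k * y
    A = [[sx[0], sx[1], sx[2]],
         [sx[1], sx[2], sx[3]],
         [sx[2], sx[3], sx[4]]]
    d = det3(A)
    if d == 0:
        raise ValueError("normal equations are singular (e.g. repeated x values)")
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of A by the right-hand side.
        Ak = [row[:] for row in A]
        for r in range(3):
            Ak[r][col] = sxy[r]
        coeffs.append(det3(Ak) / d)
    return tuple(coeffs)


# Three key points that lie exactly on y = 1 + x^2.
print(fit_quadratic([(0, 1), (1, 2), (2, 5)]))
```

With exactly three key points at distinct x values the fitted parabola passes through all of them; with more points the same routine returns the least squares estimate.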
Table 1. Coding of Estimated Shapes
To clarify this method, the objects in Fig. 5 have been coded and shown in Fig. 6 . In this way all images are processed and populated into the database. The following facts have been used in order to characterize the shapes.
When y is constant, the straight line is parallel to the x - axis.
When x is constant, the straight line is parallel to the y - axis.
When a straight line contains both x and y values which are changing, it is at some angle with the x - axis.
If a curve has constant values of y and values of x smaller than those at its starting and ending points, the curve faces up.
If a curve has constant values of y and values of x greater than those at its starting and ending points, the curve faces down.
If a curve has constant values of x and values of y smaller than those at its starting and ending points, the curve faces right.
If a curve has constant values of x and values of y greater than those at its starting and ending points, the curve faces left.
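The three straight-line rules above can be sketched as a small classifier. This is only an illustration: it assumes the points are already known to lie on a straight line, the returned labels are hypothetical stand-ins for the codes of Table 1, and the `tol` parameter is an assumed tolerance for "constant". The curve rules are not covered here.

```python
def classify_line(points, tol=1e-9):
    """Classify a straight line from its estimated (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_constant = max(xs) - min(xs) <= tol
    y_constant = max(ys) - min(ys) <= tol
    if x_constant and y_constant:
        return "single point"        # degenerate input, not a line
    if y_constant:
        return "parallel to x-axis"
    if x_constant:
        return "parallel to y-axis"
    return "at an angle to x-axis"


print(classify_line([(0, 2), (1, 2), (2, 2)]))
print(classify_line([(3, 0), (3, 1), (3, 2)]))
print(classify_line([(0, 0), (1, 1), (2, 2)]))
```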
The next section describes how color features are estimated to ensure accurate image matching and retrieval.
Fig. 6. Objects according to Defined Shapes
- 3.4 Color Features Estimation
Color features are generally said to be high level features, as they are nearer to user perception and the vision system, while shape features are considered low level features. Here color features are taken as secondary level features. Color features are estimated after processing the shape features for every resident image. Retrieved images are selected on the basis of the shape feature descriptor discussed above and are then sorted with respect to their color values, so that the results presented to the user are nearer to the query image. For color feature estimation, we start from the top left corner of the image and proceed from left to right and top to bottom, pixel by pixel, selecting a 3×3 neighborhood centered at each pixel under consideration. From the neighborhood, the mean of the maximum and minimum values is obtained by virtue of the following expressions:
$$M_n = \max_{(i,j) \in W_n} I(i,j)$$
$$m_n = \min_{(i,j) \in W_n} I(i,j)$$
$$\mu_n = \frac{M_n + m_n}{2}$$
where W_n is the 3×3 neighborhood of the n-th pixel, I(i, j) is the intensity at position (i, j), and μn shows the mean value of the maximum and minimum values of the neighborhood. After scanning the whole image and obtaining this mean value for the neighborhood of each pixel, a single mean value over all N pixels is obtained according to the following expression:
$$\mu = \frac{1}{N} \sum_{n=1}^{N} \mu_n$$
These color features take color distribution into account in the whole image.
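The max-min mean can be sketched as follows. This is a minimal illustration under assumptions the text does not settle: the image is a single channel given as a list of rows, and neighborhoods at the border are clipped to the image rather than padded. The function name is hypothetical.

```python
def max_min_mean(image):
    """Mean over all pixels of (max + min) / 2 of each pixel's 3x3 neighborhood."""
    rows, cols = len(image), len(image[0])
    total = 0.0
    for r in range(rows):          # top to bottom
        for c in range(cols):      # left to right
            # Assumption: the window is clipped at the image border.
            window = [image[i][j]
                      for i in range(max(r - 1, 0), min(r + 2, rows))
                      for j in range(max(c - 1, 0), min(c + 2, cols))]
            total += (max(window) + min(window)) / 2
    return total / (rows * cols)


# One bright pixel in a dark 3x3 image: every neighborhood contains it.
print(max_min_mean([[0, 0, 0],
                    [0, 9, 0],
                    [0, 0, 0]]))
```

Because each neighborhood contributes only its extremes, a single outlier pixel influences the value of every window that contains it, which is worth keeping in mind when comparing this descriptor with a plain average.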
- 3.5 Relevance Feedback
In real world implementations, one observes that filling in questionnaires or giving opinions in a web application is often bothersome for users. Usually users ignore such questions, and even when they are inclined to answer them, they do not feel comfortable with such applications. Keeping this in mind, a new scheme for recording user responses has been introduced in this paper: whenever a user clicks an image among the retrieved images, a record is maintained automatically for the query and the selected image. Based on these records, the system can be retuned according to user experience. Moreover, when a user clicks an image, it is considered to be a positive image example, and all images from its category are presented to the user.
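A minimal sketch of this implicit feedback scheme is given below. It assumes that every resident image carries a known category label; the class name `ClickLog`, its methods and the sample identifiers are hypothetical and only illustrate the record keeping described above.

```python
from collections import defaultdict


class ClickLog:
    """Records clicks per query and expands a click to its whole category."""

    def __init__(self, category_of):
        self.category_of = category_of      # image id -> category label (assumed known)
        self.clicks = defaultdict(list)     # query id -> clicked image ids

    def record_click(self, query_id, image_id):
        # The click is stored without asking the user anything.
        self.clicks[query_id].append(image_id)
        # The clicked image is a positive example: return its whole category.
        category = self.category_of[image_id]
        return sorted(i for i, c in self.category_of.items() if c == category)


catalog = {"img1": "food", "img2": "food", "img3": "car"}
log = ClickLog(catalog)
print(log.record_click("query1", "img1"))
```

The stored `clicks` mapping is what would later be used to retune the system; how that retuning is done is not specified in the text and is left out here.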
- 3.6 Image Indexing
Image indexing refers to the process of searching and storing images into the database in such an effective way that it makes the matching and retrieval process fast and easy. In the field of content based image retrieval, images are stored in the form of features which are based on digits. A number of indexing schemes have been introduced in literature to make this process fast and efficient. In the proposed work, the features have been stored in the form of vectors. A sample features storage data file has been shown in Table 2 .
Table 2. Data File for Tracking the Image Information
As shown in Table 2 above, images have been stored with their respective objects, the number of key points they contain and the codes for objects. In another data file, information about images and the number of objects they contain have been tracked and shown in Table 3 . To cope with geometric transformation a 3-bit shift code has also been created and saved in a data file as shown in Table 4 .
Table 3. Data File to Count the Objects
Table 4. Data File to Contain 3-bit Shift Code
- 3.7 Similarity Measure
When a query image is presented to the system, image segmentation is performed to get objects from the image. On these objects, key points are generated with the help of the second derivative. From these key points, curves are estimated with the least square polynomial method, and the estimated curves of each object are assigned their respective codes as explained above. These codes are converted into 3-bit left shift codes. Color information is also obtained from the query image. Hamming distance is then used as the similarity measure between the query image and the resident image. Hamming distance is considered most effective for the present case as it gives the number of differing bits between two strings. Let the query image be denoted by e and the resident image by d . The code is stored in bit form in the variables e and d , and the number of bits in a code depends upon the number of curves or lines in an object. When the string lengths do not match, it is assumed that the objects are entirely different and a measure of the difference is calculated. Hamming distance can be computed by:
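$$d_H(e, d) = \sum_{i=1}^{n} e_i \oplus d_i$$
where n is the number of bits in each code and ⊕ denotes the exclusive OR of two bits.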
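The comparison of two object codes, including the check against left shifted versions, can be sketched as below. This is an illustration under assumptions: codes are bit strings, each shape contributes a 3-bit group so that codes are rotated in steps of three bits, and codes of different length are assigned the length of the longer code as their difference, since the text does not specify that measure. All function names are hypothetical.

```python
def hamming(e, d):
    """Number of positions at which two equal-length bit strings differ."""
    if len(e) != len(d):
        raise ValueError("codes must have equal length")
    return sum(a != b for a, b in zip(e, d))


def left_rotations(code, step=3):
    """All circular left shifts of code in steps of `step` bits (assumed 3-bit groups)."""
    return [code[k:] + code[:k] for k in range(0, max(len(code), 1), step)]


def rotation_tolerant_distance(e, d):
    """Smallest Hamming distance between e and any left-shifted version of d."""
    if len(e) != len(d):
        # Assumption: objects with codes of different length are entirely different.
        return max(len(e), len(d))
    return min(hamming(e, r) for r in left_rotations(d))


print(hamming("001010", "010001"))
print(rotation_tolerant_distance("001010", "010001"))
```

In the usage above the two codes differ in four bits when compared directly, but they match exactly once the second code is shifted left by one 3-bit group, which is the situation the shift codes are meant to catch.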
This difference is also calculated against the left shifted codes in the data file to make sure that a rotated form of an object is not missed. Once the differences between objects are calculated, they are summed up for each resident image. For example, if image1 has three objects and the query image has two objects, the differences between objects are calculated one by one as: (img1,obj1 - imgq,obj1), (img1,obj2 - imgq,obj1), (img1,obj3 - imgq,obj1), (img1,obj1 - imgq,obj2), (img1,obj2 - imgq,obj2) and (img1,obj3 - imgq,obj2). The number of differences calculated for each resident image is therefore the product of the numbers of objects in the two images (here 3 × 2 = 6). Experiments have shown that an image should be dropped if the sum of this difference is more than 30% of the number, and if it is less than 40% the image is termed a relevant image.
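The pairwise summation over objects can be sketched as follows. The example is illustrative only: the object codes are made up, unequal-length codes are counted as differing in every bit of the longer code (an assumption), and no acceptance threshold is applied because the text does not state it unambiguously.

```python
def hamming(e, d):
    """Number of differing bits between two equal-length bit strings."""
    return sum(a != b for a, b in zip(e, d))


def summed_difference(query_objects, resident_objects):
    """Sum the pairwise object differences and count the compared pairs."""
    total = 0
    pairs = 0
    for q in query_objects:
        for r in resident_objects:
            if len(q) != len(r):
                # Assumption: codes of unequal length differ in every bit
                # of the longer code.
                total += max(len(q), len(r))
            else:
                total += hamming(q, r)
            pairs += 1
    return total, pairs


query = ["001010", "011"]              # two objects in the query image (made-up codes)
resident = ["001010", "011", "111"]    # three objects in a resident image (made-up codes)
print(summed_difference(query, resident))
```

The second returned value is the product of the two object counts, matching the six comparisons listed in the example above.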
- 3.8 Invariance to Geometric Transformations
The proposed method copes with geometric transformations such as rotation, scaling and translation. Rotation is handled by the left shift codes, which ensure that rotated versions of images are represented in the database; each query image is also compared with these rotated versions to find the matched image. Scaling is handled because a shape has the same slope points regardless of its size. For example, a small circle and a large circle have the same slope points; only the size of the curves differs, which has no effect because the codes are the same. Translation is handled by image segmentation, which detects an object wherever it lies in the image.
4. Experimental Results and Analysis
In this section the proposed method is tested for its retrieval performance. Different offline experiments are performed to check the strength of the method. System specification, quantification measures, experiments and comparison with some state of art methods are discussed in detail in the following sub-sections.
- 4.1 System Specification
System details are given in Table 5 . The dataset used is the Corel database [13 , 14] . It contains 10,000 low resolution images of size 128x85, divided into 100 categories with 100 images in each category. The images also include rotated images to check the proposed method’s robustness against geometric transformations. Categories include both single object images and composite images with multiple objects.
Table 5. System Details
- 4.2 Quantification Measures
The widely used and accepted precision and recall rates have been used to evaluate the performance of the proposed method. Let the total relevant retrieved images be denoted by O , the total images to be retrieved by P and the total images of a category by Q . Precision and recall can be expressed by the following equations:
$$\text{Precision} = \frac{O}{P}$$
$$\text{Recall} = \frac{O}{Q}$$
Precision measures the ratio of relevant retrieved images to the images intended to be retrieved; for example, when the system is set to retrieve 50 images per search click, it is the share of relevant images among these 50. Recall measures the ratio of relevant retrieved images to the total images in the category to which the query image belongs; for example, with 100 images in each category, it is the share of those 100 that appear among the relevant retrieved images. P and Q are pre-defined whereas O depends on what the system retrieves at each search click.
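The two measures can be computed as in the short sketch below. The values P = 50 and Q = 100 follow the setup described in this paper, while O = 40 is a made-up count used only to show the arithmetic; the function name is hypothetical.

```python
def precision_recall(relevant_retrieved, to_retrieve, category_size):
    """Return (precision, recall) as O / P and O / Q."""
    precision = relevant_retrieved / to_retrieve
    recall = relevant_retrieved / category_size
    return precision, recall


# O = 40 is a hypothetical count of relevant retrieved images.
print(precision_recall(relevant_retrieved=40, to_retrieve=50, category_size=100))
```

With P fixed at 50 and Q at 100, recall computed from a single search click cannot exceed 0.5, so higher recall requires presenting additional images, for example through the relevance feedback step.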
- 4.3 Retrieval Results
A query image from “food” category is provided from database to the proposed system and top 15 retrieval results are shown in Fig. 7 .
Fig. 7. Retrieval Results: Query Image “Food” (top) and Top 15 Retrieved Results (bottom)
One can see from Fig. 7 how accurately the proposed system retrieves the relevant images. Looking at the query image, it is evident that boundary extraction after image segmentation produces two objects, i.e., the inner food kept on the plate and the outer circle of the plate. Application of the proposed method generates two codes for these objects as well as a color description value. Similarity is then calculated in the manner discussed above, as a result of which the images which are closer in color to the query image are sorted at the top. Many other round objects exist in the database, but accurately similar objects are obtained because the inner object is also considered. Therefore, the images with these objects or objects near to them have been selected. In the next experiment an image from outside the database is provided as the query image.
A query image of “car” outside the database has been provided to the proposed system and top 15 retrieval results have been shown in Fig. 8 .
Fig. 8. Retrieval Results: Query Image of “Car” (top) and Top 15 Retrieved Results (bottom)
Image is segmented into six different objects and boundaries of these objects are obtained. It can be observed that three objects contain lines, two objects contain curves and one object contains combination of both. These objects are coded according to the proposed method discussed above and compared with already populated values in the database and it is shown how effectively similar images are picked as retrieved images.
One image from each category of Butterfly, Food, Cars, People, Shapes, Sunset, Ocean and Desert is provided as query image and values of precision and recall rates are recorded.
P = images to be retrieved = 50
Q = total images in a category = 100
O = relevant retrieved images
Table 6. Precision and Recall
Table 6 shows the precision and recall rates for the proposed method, which reflect the effectiveness and accuracy of image matching. Experiments have confirmed that images with fewer objects are more accurately matched and retrieved, because images with many objects have more chances of being matched with other images containing the same objects. It can also be seen from Table 6 that shapes with fewer objects have a 100% precision rate. In the next experiment, a comparison is made between the proposed method and some commonly used methods.
Comparison of content based image retrieval needs same computing environment and database because comparing the two methods using different environments and database will give meaningless results. Therefore in this experiment some retrieval methods are implemented in the same computing environment and values of average precision are recorded in order to compare the proposed method with them. These methods include color histogram (color feature), color co-occurrence matrix (CCM), HSI color histogram, Color + texture (HSI color histogram, CCM), Color + Texture + Shape and Color + Texture+ Shape + Relevance Feedback. Table 7 and Fig. 9 show the comparison of results numerically and visually.
Table 7. Average Precision Comparison of Different Methods
Fig. 9. Comparison of Average Precision
It is clear from Table 7 and Fig. 9 that our proposed method achieves the highest average precision by using combined features of shape, color and relevance feedback. This is due to the new and self-introduced features. Shape feature has been used as a primary feature and the whole retrieval performance relies on it. Slope points are detected effectively by using second derivative. Least square polynomial estimates the curves effectively. Moreover, we have introduced a generic method of coding the estimated curves or lines which is very useful in getting accurate results. For user perceptual satisfaction, color features have been extracted to provide the nearest results in color as well. Relevance Feedback has been used in a different way. User responses have been recorded without bothering the user. When user clicks an image, all images belonging to that category are presented to the user automatically. This is helpful in getting recall rates up to 100%.
5. Conclusion
A stand-alone image feature gives inaccurate and irrelevant retrieval results. System performance increases and better results are achieved when two or more features are combined. Our proposed technique combines multiple self-defined features: shape, color and relevance feedback. To prove the strength of the proposed method, other combinations of features were also tested on the same database and in the same environment, as shown in Table 7 . These combinations were color, texture, color + texture, color + texture + shape, and color + texture + shape + relevance feedback, using the popular methods of color histogram and HSI color histogram for color description, edge histogram descriptor for shape, CCM for texture description, and relevance feedback. The proposed method combines shape, color and relevance feedback, but its uniqueness rests in the fact that it uses newly defined methods for the shape, color and relevance feedback descriptions: the second derivative for slope point estimation, the least square polynomial for curve estimation, a coding mechanism, the max-min average for color description and a modified relevance feedback. This mechanism performs the retrieval process accurately and effectively. As the experimental results show, the proposed method achieves the highest precision and retrieval accuracy compared with the other similar techniques tested.
BIO
Mussarat Yasmin
Assistant Professor
PhD Student, COMSATS Institute of Information Technology Pakistan
MS (SE), IQRA University Karachi
M.Sc. (CS), University of Peshawar
Research Interests
Image Processing, Machine Learning, Artificial Intelligence, Neural Networks, Operating Systems
Dr. Muhammad Sharif
Assistant Professor
COMSATS Institute of Information Technology Pakistan
PhD, CIIT Islamabad
M.Sc. (CS), QAU Islamabad
MS (CS), CIIT Islamabad
Research Interests
Image Processing, Neural Networks, Operating Systems, Computer Graphics, Design of Algorithms
Prof. Dr. Sajjad Mohsin
Dean / Professor, Faculty of IS&T, CIIT Islamabad, Pakistan
PhD, MIT Japan
ME, MIT Japan
M.Sc. (CS), QAU Islamabad
Research Interests
Neural Networks, Genetic Algorithm, Fuzzy Modeling, Expert Systems, Artificial Intelligence
Isma Irum
MS (CS), CIIT Wah
M. Sc. (CS), University of Wah
Research Interests
Image Enhancement, Medical Images Analysis, Feature Extraction and Image Retrieval
References
[1] Yasmin M., Mohsin S., “Image Retrieval by Shape and Color Contents and Relevance Feedback,” in Proc. of 10th International Conference on Frontiers of Information Technology, December 17-19, 2012, pp. 282-287.
[2] Yasmin Mussarat, Mohsin Sajjad, Irum Isma, Sharif Muhammad, “Content Based Image Retrieval by Shape, Color and Relevance Feedback,” Life Sciences Journal, vol. 10, no. 4, pp. 593-598, 2013.
[3] Yasmin Mussarat, Sharif Muhammad, Irum Isma, Mohsin Sajjad, “Powerful Descriptor for Image Retrieval Based on Angle Edge and Histograms,” Journal of Applied Research and Technology, vol. 11, no. 6, pp. 727-732, 2013.
[4] Jeong Seyoon, Kim Kyuheon, Chun Byungtae, Lee Jaeyeon, Bae Youglae, “An Effective Method for Combining Multiples Features of Image Retrieval,” in Proc. of TENCON, Proceedings of the IEEE Region 10 Conference, September 15-17, 1999, pp. 982-985.
[5] Yumin Tian, Lixia Mei, “Image Retrieval based on Multiple Features using Wavelet,” in Proc. of Fifth International Conference on Computational Intelligence and Multimedia Applications, September 27-30, 2003, pp. 137-142.
[6] Ooi Beng Chin, Shen Heng Tao, Xia Chenyi, “Towards Efficient Image Retrieval based on Multiple Features,” in Proc. of the Joint Conference of the Fourth International Conference on Information, Communication and Signal Processing, December 15-18, 2003, pp. 180-185.
[7] Xiangyang Wang, Fengli Hu, Hongying Yang, “A Novel Region-of-Interest based Image Retrieval using Multiple Features,” in Proc. of 12th International Multi-Media Modelling Conference, 2006.
[8] Ha Jeong-Yo, Kim Gye-Young, Choi Hyung-Il, “The Content based Image Retrieval Method using Multiple Features,” in Proc. of Fourth International Conference on Networked Computing and Advanced Information Management, September 2-4, 2008, pp. 652-657.
[9] Jyothi B., Madhave Latha Y., Reddy V.S.K., “Relevance Feedback Content based Image Retrieval using Multiple Features,” in Proc. of IEEE International Conference on Computational Intelligence and Computing Research, December 28-29, 2010, pp. 1-5.
[10] Priya R., Vasantha Kalyani David, “Optimized Content based Image Retrieval System based on Multiple Feature Fusion Algorithm,” International Journal of Computer Applications, vol. 31, no. 8, 2011.
[11] Yu Jun, Liu Dongquan, Tao Dacheng, Seah Hock Soon, “On Combining Multiple Features for Cartoon Character Retrieval and Clip Synthesis,” IEEE Transactions on System, Man and Cybernetics-Part B: Cybernetics, vol. 42, no. 5, pp. 1413-1427, 2012. DOI: 10.1109/TSMCB.2012.2192108
[12] Zhang Qianni, Izquierdo Ebroul, “Histology Image Retrieval in Optimized Multifeature Spaces,” IEEE Journal of Biomedical and Health Informatics, vol. 17, no. 1, pp. 240-249, 2013. DOI: 10.1109/TITB.2012.2227270
[13] Li Jia, Wang James Z., “Automatic linguistic indexing of pictures by a statistical modeling approach,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 9, pp. 1075-1088, 2003. DOI: 10.1109/TPAMI.2003.1227984
[14] Wang James Z., Li Jia, Wiederhold Gio, “SIMPLIcity: Semantics-sensitive Integrated Matching for Picture LIbraries,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947-963, 2001. DOI: 10.1109/34.955109