Abstract
In this article, we propose an indexing and retrieval approach for outline shapes. Models of objects are stored in a database using the textual descriptors of their silhouettes. From the textual description, we extract a set of efficient similarity measures to index the silhouettes. The extracted features are geometric quasi-invariants that vary only slightly with small changes in the viewpoint. We use the textual description and the quasi-invariant features to minimize storage space and achieve an efficient indexing process, and we use the quad-tree structure to improve processing time during indexing. Combining geometric features with the quad-tree decomposition facilitates the recognition and retrieval processes. Our approach is applied to the outline shapes of three-dimensional objects. Experiments conducted on two well-known databases show the efficiency of our method in real-world applications, especially image indexing and retrieval.
1 Introduction
Geometry-based methods are feature-based methods that extract points from the image (usually edge or corner points) and reduce the problem to point-set matching. Users are more interested in matching and retrieval by shape than by color and texture [3–5, 8, 12, 14, 16, 17, 19, 27, 28, 31, 32]. Although color and texture are important characteristics, shape remains the most widely used characteristic because it distinguishes between objects better than other features do [25, 26, 30].
Segments are particularly interesting features [9, 10, 15] because of their robustness to noise and their connectedness constraint, which reduces the possibility of false matches. They vary slightly with a small change in the viewpoint, yet they remain invariant under a similarity transform of the image [6].
Because these features are widely used to match objects, we need geometric-invariant features. Thus, we used two-dimensional (2D) image features – the intersecting segments – and transformed them into pairs of quasi-invariant features.
Our aim was to develop a complete three-dimensional (3D) object recognition system based on (i) the extraction of pairs of quasi-invariants from the textual description of shapes and (ii) the matching of geometric quasi-invariants between the query and the models. We also used the quad-tree structure to improve the processing time of the recognition process. This ensures that recognition and indexing are restricted to 2D–2D matching. Instead of directly interpreting the 3D object information, we store several 2D features (quasi-invariants) of a 3D object and perform object retrieval in the representation space of the 2D indexes.
Indeed, although we can recognize 3D objects, we deal with 2D views because our approach is based on appearance. In appearance-based methods, instead of dealing with a 3D model of an object, we deal with many 2D views of the same object. Thus, when a given query (which is a 2D image) is matched with one of the 2D views of a 3D object, that 3D object is therefore recognized.
Quad trees are very popular and are extensively used in computer graphics and computer vision [7, 20, 23, 29] because they can be manipulated and accessed more quickly than other models. Recursive pictures can be easily implemented using quad trees. Other advantages of quad trees include the following [21, 22, 24]:
Erasing a picture takes only one step, i.e., setting the root node to neutral.
Zooming to a particular quadrant in the tree is a one-step operation as well.
Reducing the complexity of the image also involves a single step, i.e., removing the final level of the nodes.
Accessing particular regions of the image is a very fast operation. This is useful for updating certain regions of an image.
The only drawback of quad trees is that they take up a lot of space. If a quad tree is implemented using links, most of the memory will be taken up by the links. Nevertheless, there are ways of compacting quad trees, which is important for transferring data efficiently.
To overcome this drawback, two levels of the quad tree are sufficient to index all the images of the database (see Section 4), as we have demonstrated in a previous work.
The article is structured as follows: in Section 2, we present the geometric quasi-invariants. The textual description of shapes is detailed in Section 3. In Section 4, we show the decomposition of shapes following the quad-tree structure. Section 5 focuses on our approach to recognize and retrieve the best model for the given query. In Section 6, we use real images of two well-known databases and discuss the obtained results after applying our approach.
2 Geometric Quasi-Invariants
There are a large number of contributions for the retrieval problem with geometric features [25, 26, 30]. Segments are particularly interesting features because of their robustness to noise and their connectedness constraint. They reduce the possibility of false matches because they are based on a topological reality in the image. They can vary slightly with a small change in the viewpoint, but they can also remain invariant under similarity transform of the image [6].
Because these features are used to match objects, we need geometric-invariant features. We therefore consider the intersecting segments as 2D image features and transform them into pairs of quasi-invariant features (ρ, θ).
The quasi-invariants (ρ, θ) are defined as the angle θ between the intersecting segments and the segment length ratio ρ. The (ρ, θ) pairs extracted from each image (see Figure 1) vary slightly with a small change in the viewpoint, and they are invariant under a similarity transform of the image [6, 10]. In our approach, the (ρ, θ) pairs are computed from the textual description of shapes.

Figure 1. Geometric Quasi-Invariants (ρ and θ).
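To make the construction concrete, the following minimal Python sketch computes a (ρ, θ) pair for two segments meeting at a common point. The point representation and the ordering convention for ρ are our assumptions, not part of the descriptor language.

```python
import math

def quasi_invariants(p, a, b):
    """(rho, theta) for two segments [p, a] and [p, b] meeting at point p.
    Points are (x, y) tuples; the ordering convention for rho is ours."""
    v1 = (a[0] - p[0], a[1] - p[1])
    v2 = (b[0] - p[0], b[1] - p[1])
    len1, len2 = math.hypot(*v1), math.hypot(*v2)
    # theta: angle between the two segments, in degrees
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (len1 * len2)
    theta = math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    # rho: length ratio of the two segments
    rho = len1 / len2
    return rho, theta
```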
3 Textual Description of Outline Shapes
The part-based method builds shape descriptors using a minimum rectangle (MR) that encloses the outline shape [2, 11]; (OXY) is the reference frame attached to the chosen MR, such that the origin O is the top-left corner of the MR (see Figure 2).

Figure 2. Initial Silhouette, the Minimum Rectangle, and the Rotated Silhouette.
From this geometric description, the outline shape may be drawn without ambiguity, which implies the property of uniqueness and the preservation of perceptual structure. The invariance of this description to rotation is guaranteed by sweeping the silhouette along one of the directions of the MR encompassing it [2, 11].
The first step of the shape description is its decomposition into parts and separating lines. For this, we sweep the outline shape from top to bottom along the horizontal direction of the MR and locate the concave points at which the outer contour changes direction, either top-bottom-top or bottom-top-bottom (see Figure 3).

Figure 3. Location of Decomposition Points.
The outline shape is decomposed at these points into parts and separating lines: two or more parts are joined with a third part through a junction line. One part is joined with two or more parts through a disjunction line. This process applied, for example, to the silhouette of Figure 3 produces five parts and two junction lines (see Figure 4). Parts and separating lines are numbered from top to bottom and left to right.

Figure 4. Parts, Junction, and Disjunction Lines.
3.1 Global Textual Descriptor
The global descriptor of a silhouette is an XML descriptor in which only parts, composed parts, and junction and disjunction lines appear, without their geometric descriptions. For example, the global descriptor of the silhouette of Figure 4 is <CP><CP>P1 P2 JL1 P4</CP>P3 JL2 P5</CP>, where <CP> and </CP> are marks for a composed part, Pi designates the ith part, JLk designates the kth junction line, and DJLn designates the nth disjunction line. Different silhouettes may have the same global descriptor.
3.2 Full Textual Descriptor
The second step is the description of each element (each part and line) to guarantee the uniqueness of the outline shape representation.
A full textual descriptor is an XML descriptor based on the geometry of silhouette features defined below [2, 11]:
A part is described by its left and right boundaries.
A boundary is a succession of primitives.
A primitive may have one of the three following shapes: convex curve (Cv), concave curve (Cc), or a straight line segment (Ln).
A primitive is characterized by its inclination angle and length.
The length of a primitive refers to the height of the primitive, except when the primitive is horizontal (see Figure 5).
A curve is approximated by two segments. For example, the curve (ab) in Figure 6 is approximated by the two segments [ac] and [bc], where cd is the greatest distance between the curve and the segment [ab] (a small code sketch of this split is given after Figure 6).

Figure 5. Full Descriptor of a Part.

Figure 6. Convexity and Concavity Degrees.
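As an illustration of this two-segment approximation, the sketch below finds the curve point c farthest from the chord [ab], under the assumption that the curve is available as a list of sampled points; the function name is ours.

```python
import math

def farthest_from_chord(curve):
    """Return the curve point c farthest from the chord [a, b] and the
    distance cd, for a curve sampled as a list of (x, y) points from a
    to b (a sketch of the two-segment approximation of Figure 6)."""
    (ax, ay), (bx, by) = curve[0], curve[-1]
    chord = math.hypot(bx - ax, by - ay)
    c, cd = None, -1.0
    for (x, y) in curve[1:-1]:
        # perpendicular distance from (x, y) to the line through a and b
        d = abs((bx - ax) * (ay - y) - (ax - x) * (by - ay)) / chord
        if d > cd:
            c, cd = (x, y), d
    return c, cd
```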
For example, the description of part 2 of Figure 4 (see Figure 5) is <P2><L>r 43 110</L><R>r 180 23 CV 9 121 110</R></P2>. The left boundary (designated by L) of part 2 (see Figure 5) is composed of a line segment [a,b] with α=43° of inclination and length=110 pixels. The right boundary (designated by R) is composed of a line segment [b,c] with α=180° of inclination and length=23 pixels followed by a convex contour [c,d] with degree of convexity=0.09 (9%), β=121° of inclination, and length=110 pixels (“CV” is used to indicate a convex curve and “CC” for a concave curve).
A contour of a shape is segmented into a set of elementary contours (primitives) by determining the curvature points (inflexion points) located on the contour [2, 11].
We distinguish between the parts using the separating lines. As shown in Figure 4, P1 and P2 are separated from P4 by the junction line JL1. P4 and P3 are separated from P5 by the junction line JL2.
These separating lines are decomposed into segments. Each segment is described with three parameters: type, reference numbers of the linked parts, and length.
Three types of segments are possible: shared (designated by S), if the segment is common to two parts; free-high (designated by H), if its neighbor is the upper part; and free-low (designated by W), if its neighbor is the lower part (see Figure 7).

Figure 7. Description of Parts and Separating Lines.
For example, the junction line JL1 description is <JL1>S P1 P4 120 W P4 120 S P2 P4 120</JL1>. This line is decomposed into three segments:
The first segment is shared between parts P1 and P4; its length is 120 pixels.
The second segment is free low and adjacent to part P4; its length is 120 pixels.
The last segment is shared between P2 and P4; its length is 120 pixels.
To obtain the full descriptor, we replace each part and separating line in the global descriptor by its description. For example, the full textual descriptor of the shape in Figure 7 is
<CP><CP><P1><L>r 90 110</L><R>r 180 120 r 90 110</R></P1>
<P2><L>r 43 110</L><R>r 180 23 CV 9 121 110</R></P2>
<JL1>S P1 P4 120 W P4 120 S P2 P4 120</JL1>
<P4><L>r 90 250</L><R>r 90 250</R></P4></CP>
<P3><L>r 90 305</L><R>r 180 120 r 90 305</R></P3>
<JL2>S P4 P5 360 W P5 150 S P3 P5 150</JL2>
<P5><L>r 90 100 r 180 660</L><R>r 90 100</R></P5></CP>.
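To illustrate how such a descriptor might be consumed, the following sketch parses one part descriptor into its primitives. The token layout is inferred from the examples above (a straight segment "r" carries an inclination and a length; "CV"/"CC" carry a degree, an inclination, and a length); the function names are ours.

```python
import re

def parse_boundary(text):
    """Tokenize one boundary, e.g. 'r 180 23 CV 9 121 110', into primitives."""
    tokens = text.split()
    prims, i = [], 0
    while i < len(tokens):
        kind = tokens[i]
        if kind == "r":        # straight segment: inclination, length
            prims.append(("line", int(tokens[i + 1]), int(tokens[i + 2])))
            i += 3
        elif kind in ("CV", "CC"):   # curve: degree, inclination, length
            prims.append((kind, int(tokens[i + 1]), int(tokens[i + 2]), int(tokens[i + 3])))
            i += 4
        else:
            raise ValueError(f"unknown primitive {kind!r}")
    return prims

def parse_part(descriptor):
    """Split a part descriptor such as <P2><L>...</L><R>...</R></P2>."""
    left = re.search(r"<L>(.*?)</L>", descriptor).group(1)
    right = re.search(r"<R>(.*?)</R>", descriptor).group(1)
    return parse_boundary(left), parse_boundary(right)
```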
4 Decomposition of Shapes Following the Quad-Tree Structure
When decomposing the image following the quad-tree structure, we divide the picture area into four sections. Those four sections are then further divided into four subsections. We continue this process, repeatedly dividing a square region by four (see Figure 8).

Figure 8. The Quad-Tree Structure.
Thus, in terms of a quad tree, the children of a node represent the four quadrants. The root of the tree is the entire picture.
To represent a picture using a quad tree, each leaf must represent a uniform area of the picture. If the picture is black and white, we only need one bit to represent the color in each leaf, for example, 0 could mean black and 1 could mean white.
Figure 9 shows a dichotomy by blocks of an image. Initially, there is one block that represents the entire image. If the block (the quadrant) is homogeneous (it contains pixels that are similar with respect to color, texture, or other properties), then we stop the decomposition; otherwise, the block is decomposed into four subblocks recursively until the homogeneity of the small blocks (small quadrants) is obtained.

Figure 9. An Example of a Quad-Tree Hierarchy.
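One possible node layout for such a quad tree is sketched below; the class name, field layout, and quadrant order are our assumptions. A uniform leaf stores the single color bit described above, and a subdivided ("gray") node stores its four quadrant children.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class QuadNode:
    """Sketch of a quad-tree node: a uniform leaf stores one bit
    (0 = black, 1 = white); a subdivided node stores four children
    in a fixed quadrant order (NW, NE, SW, SE)."""
    color: Optional[int] = None
    children: Optional[Tuple["QuadNode", ...]] = None
```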
Figure 10, for example, shows a recursive decomposition of a quad tree according to quadrant homogeneity. When the quadrant is entirely white (outside the shape) or entirely black (inside the shape), the decomposition of the quadrant will be stopped.

Figure 10. Image Decomposition Following the Quad-Tree Structure.
We have demonstrated in a previous work [1] that two levels of the quad tree (16 quadrants) are sufficient to reduce the processing time and discard many models of the database that are far from the query. Thus, after this step, only a few models (for a given query) are maintained. This step is very important, especially for large image databases.
5 Image Retrieval and Recognition
Our recognition process aims to retrieve and recognize the best model for the given query. It is an appearance-based method in which 3D objects are represented by multiple views. The recognition is simplified by performing 2D matching and retrieval between the query and all 2D models of all objects in the database. Finally, the best model for the given query will be determined after applying similarity measures.
It is important to note that in appearance-based methods, the query is theoretically situated between two models of the database (except when the query is a view of an object that is not in the database).
Each step of our approach requires the determination of intervals, or thresholds, to compare the query with the models. The determination of thresholds must be done in an offline process using hundreds of nearby 2D models of many objects. To retrieve the best model, in an online process, the query and the models are compared. If the distance between a model and a given query is outside such intervals, the model will be discarded. In contrast, if many models verify the threshold values, we sort those models, following an efficient similarity measure, to keep the best one that is close to the query.
5.1 Offline Process
We consider hundreds of objects (see Section 6) and study the variation of characteristics and features between nearby images to find the thresholds of the similarity measures used in our approach.
To compute the thresholds, we take into account all pairs of nearby models and compute the differences between the features extracted from each pair of nearby images.
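A minimal sketch of this offline computation is given below. The percentile-based cutoff is our assumption, chosen to match the coverage statistics reported in Section 6.

```python
import numpy as np

def derive_threshold(pairs, coverage=0.95):
    """Offline threshold sketch: `pairs` holds (f_i, f_j) feature values
    measured on nearby views of the same object; the returned value is
    the difference covering e.g. 95% of the pairs. The coverage level is
    an assumption consistent with the statistics in Section 6."""
    diffs = np.abs(np.array([a - b for a, b in pairs]))
    return float(np.percentile(diffs, 100 * coverage))
```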
5.1.1 Variation of the Filling Rate of the Quadrants
We define the filling rate as the percentage of black pixels in a quadrant relative to the number of all pixels in the same quadrant (see Figure 11).

Figure 11. Decomposition of a Silhouette into 16 Quadrants.
The filling rate of black quadrants is 100% because the quadrants are inside the silhouette in this case. However, the filling rate of white quadrants is 0% because the quadrants are outside the silhouette.
The filling rate of a quadrant is given by rate = (number of black pixels in the quadrant / total number of pixels in the quadrant) × 100%.
The subdivision of the quad tree is recursive:
If the rate is 0%, then the quadrant is entirely white and the quadrant will not be divided into four subsquares. This is the first condition to stop the recursive division.
If the rate is 100%, then the quadrant is entirely black and the quadrant will not be divided into four subsquares. This is the second condition to stop the recursive division.
If the rate is a value between 0% and 100%, then the quadrant contains white and black pixels; in this last case, the quadrant will be divided into four subsquares. We will stop the recursive decomposition when the indexing process is sufficiently consistent and all objects are classified into the appropriate classes.
The algorithm of the subdivision process is given below:
Algorithm Subdivision;
// rate represents the filling rate of the quadrant
BEGIN
For each level I of the quad tree Do
For each quadrant J of level I Do
If (rate = 0% or rate = 100%) then
Stop the subdivision process for quadrant J
Else
Decompose the quadrant into four subquadrants
End If
End For
End For
END.
An example of a decomposition following the filling approach was given in Figure 9.
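The following NumPy sketch implements the same subdivision, assuming a binary image in which 1 marks silhouette (black) pixels; limiting max_level to 2 yields the 16 quadrants used in our indexing.

```python
import numpy as np

def filling_rate(img, y0, y1, x0, x1):
    """Percentage of black pixels in the quadrant img[y0:y1, x0:x1];
    img is assumed to be a binary array with 1 marking silhouette pixels."""
    quad = img[y0:y1, x0:x1]
    return 100.0 * quad.sum() / quad.size

def subdivide(img, y0, y1, x0, x1, level=0, max_level=2):
    """Recursive subdivision: stop on uniform quadrants (0% or 100%) or
    at max_level; two levels give the 16 quadrants used for indexing."""
    rate = filling_rate(img, y0, y1, x0, x1)
    if rate in (0.0, 100.0) or level == max_level:
        return rate                 # leaf: uniform quadrant or depth reached
    ym, xm = (y0 + y1) // 2, (x0 + x1) // 2
    return [subdivide(img, a, b, c, d, level + 1, max_level)
            for (a, b, c, d) in ((y0, ym, x0, xm), (y0, ym, xm, x1),
                                 (ym, y1, x0, xm), (ym, y1, xm, x1))]
```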
Evaluations (see the Experimental section) show that the interval of the filling-rate difference to be maintained for the online process is [0, 10]. We have chosen 16 quadrants (two levels of the quad tree) because, as demonstrated in [1], this reduces the processing time and allows us to discard the models that are far from the query.
For each quadrant I of the query, the same quadrant I of the model must have a rate close to that of the query; otherwise, the model is discarded during the indexing process.
5.1.2 Computation of Quasi-Invariants from the Textual Description
Let us consider the shape in Figure 12. In the textual description, curves are approximated by segments, and all angles are given between the primitive and the horizontal axis [11] (see Section 3).

Figure 12. Determination of Quasi-Invariants from XLDWOS Descriptors.
To find all the angles between two successive segments, we have to deduce them from the textual descriptor.
For example, in Figure 12, angle (a) is given by the textual description [11]; it lies between segment R′ and the horizontal axis. From it, we compute angle (b), the angle between two successive segments (one of the quasi-invariants). Indeed, for the quasi-invariants, we need all the angles between the segments and the length ratios of successive segments (see Section 2).
The determined quasi-invariants are used to find the best model for the given query. Indeed, to achieve the recognition process, it is very important to compare the geometry (the quasi-invariants) of the query with the geometry of the models. Let us consider the query silhouette in Figure 13, which is composed of three parts, P1, P2, and P3. The model to be selected will be similar (or almost similar) to the query. Let us consider part 1. We can compute all the angles (α1 through α5) from this part as well as the segment ratio (ρ) for all successive segments. The same process is done for all parts.

Figure 13. Application of Quasi-Invariants to the Textual Descriptor.
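A minimal sketch of this deduction is given below, assuming all inclinations are measured counterclockwise from the horizontal axis (the descriptor language itself only gives angles relative to the horizontal [11]).

```python
def angle_between(alpha1, alpha2):
    """Angle between two successive segments whose inclinations to the
    horizontal axis are alpha1 and alpha2 (degrees). Assumes both angles
    are measured counterclockwise from the horizontal; the result is
    folded into [0, 180]."""
    diff = abs(alpha1 - alpha2) % 360
    return 360 - diff if diff > 180 else diff

def length_ratio(len1, len2):
    """Length ratio rho of two successive segments."""
    return len1 / len2
```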
The computed values for the selected model must be close to those of the query; thus, it is important to determine the similarity intervals (or thresholds) between the quasi-invariants of the two silhouettes to be compared. Intersecting segments are the geometric configurations used to calculate the quasi-invariant parameters, for which a set of similarity measures is defined.
For example, after applying this technique to the models of Figure 14, we deduce that model 1 may be recognized because it is almost similar to the query (ρ1 is almost equal to ρ2). Of course, we assume that all other configurations of quasi-invariants are similar between the query and model 1. Model 2 does not verify the thresholds of the quasi-invariants (ρ1 is very different from ρ3); thus, it will be discarded during the indexing and recognition processes.

Figure 14. Example of the Matching Between the Query and the Models.
5.2 Online Process
In the online study, we perform the retrieval process. For a given query, we have to find the most similar images in the database following three steps.
The first step is to decompose the query into 16 quadrants (see Figure 11) and compare the filling rates with those of the models. All models verifying the thresholds, determined in Section 5.1.1, will be selected and used for the next step.
The second step is to compute the global descriptor of the query. All models with the same global descriptor as the query will be selected for the next step.
The last step is to compute the quasi-invariants of the query and then select all the models verifying the thresholds determined in the offline study.
If many models are selected after these steps, we keep the model that minimizes the Euclidean distance to the query. This distance is computed over the quasi-invariant pairs as
d(xr, xm) = √(Σi (xri − xmi)²),
where xr represents the quasi-invariants of the query (angle or segment ratio) and xm represents those of the models.
The algorithm of the retrieval process is given below:
Algorithm Retrieval;
// rate represents the filling rate of the quadrant
BEGIN
Number = 0; // Number counts the quadrants verifying the thresholds
// decompose the query into 16 quadrants
For Quadrant 1 to Quadrant 16 Do
Compute rate;
If rate verifies the threshold then
Number = Number + 1
End If
End For
If Number = 16 then
Compute the global descriptor and the quasi-invariants;
Select the best model verifying the thresholds of the quasi-invariants
Else
Failure of the retrieval process
End If
END.
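The sketch below restates the three online steps in Python. The attribute names (rates, global_desc, invariants) are ours, introduced for illustration; the per-invariant threshold test of Section 5.1.2 is abbreviated to the final distance ranking.

```python
import math

RATE_THRESHOLD = 10.0   # filling-rate difference interval [0, 10], Section 5.1.1

def euclidean(xq, xm):
    """Euclidean distance over matched quasi-invariant values."""
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(xq, xm)))

def retrieve(query, models):
    """The three online steps. `query` and each model are assumed to expose
    .rates (16 quadrant filling rates), .global_desc (the global textual
    descriptor), and .invariants (matched quasi-invariant values); these
    attribute names are ours, not the paper's."""
    # Step 1: every one of the 16 quadrant rates must verify the threshold
    kept = [m for m in models
            if all(abs(q - r) <= RATE_THRESHOLD
                   for q, r in zip(query.rates, m.rates))]
    # Step 2: keep the models sharing the query's global descriptor
    kept = [m for m in kept if m.global_desc == query.global_desc]
    if not kept:
        return None                # failure of the retrieval process
    # Step 3: best model = smallest Euclidean distance on the quasi-invariants
    return min(kept, key=lambda m: euclidean(query.invariants, m.invariants))
```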
6 Experimentation
A variety of objects have been used to validate the proposed method. We use the database of shapes built by Leibe and Schiele [13]. This database contains 80 different objects, each represented by 41 views spaced evenly over the upper viewing hemisphere. Some objects are shown in Figure 15.
![Figure 15 A Set of Shapes of the Database of Leibe and Schiele and Some Views of an Object [13].](/document/doi/10.1515/jisys-2013-0014/asset/graphic/jisys-2013-0014_fig15.jpg)
6.1 Evaluation of the Filling Rate of Quadrants
We consider all successive images of the database. Successive images correspond to pictures taken from nearby points of view of the same object. We compute the filling rates of all quadrants between all nearby models. Each quadrant k of model i is compared with quadrant k of model i+1 (the compared quadrants occupy the same positions). Experiments show that more than 95% of nearby images have a filling-rate difference between 0 and 10 (see Figure 16).

Figure 16. Thresholds of Filling Rate.
The number of configurations in this graph refers to the number of compared quadrants.
6.2 Evaluation of Quasi-Invariants Differences
We extract all quasi-invariants ρ and θ from the models, then we compare all those extracted from nearby views. Experiments (see Figure 17) show that, for more than 90% of the quasi-invariant configurations, the differences are less than 0.21 for ln(ρ) and 17.8° for θ. [We use ln(ρ) instead of ρ because ln(ρ) follows a uniform distribution [10].]

Figure 17. Difference of Quasi-Invariants Between Successive Images.
6.3 Retrieval Process
The query is processed as explained in Section 5.2: after the three steps of the online process, the best model for the query is retrieved. Three examples are given in Figure 18.

Figure 18. Examples of the Retrieved Shapes: (Left) Best Model and (Right) Query.
Other experimental results are given in Figures 19 and 20; next to the query, we give the best selected model after the retrieval process.
![Figure 19 Retrieved Shapes for the Object “Cup” [13]: (Left) Best Model and (Right) Query.](/document/doi/10.1515/jisys-2013-0014/asset/graphic/jisys-2013-0014_fig19.jpg)
![Figure 20 Retrieved Shapes for the Object “Car” [13]: (Left) Best Model and (Right) Query.](/document/doi/10.1515/jisys-2013-0014/asset/graphic/jisys-2013-0014_fig20.jpg)
We have also tested our method using the Mokhtarian database [18], which contains 1100 images of marine animals with a wide variety of shapes (see Figure 21). All results are good; the most similar model for the query is always retrieved. Some experimental results are given in Figure 22. Next to the query, we give the best model selected after the retrieval process.
![Figure 21 Some Objects in the Mokhtarian Database [18].](/document/doi/10.1515/jisys-2013-0014/asset/graphic/jisys-2013-0014_fig21.jpg)
![Figure 22 Examples of Retrieved Shapes in the Mokhtarian Database [18].](/document/doi/10.1515/jisys-2013-0014/asset/graphic/jisys-2013-0014_fig22.jpg)
The good results are attributable both to our method, which takes into account the geometry of the object, and to the quality of the images in the Mokhtarian database [18]. All objects have distinctive shapes, and there is no ambiguity between different shapes. Indeed, all shapes in this database are well represented and not noisy. Applying our approach to this database always gives the best model for the query: the retrieval rate is 100%.
All experiments on more than 4000 images from both the Leibe and Schiele database [13] and the Mokhtarian database [18] give good results except when the view does not correctly represent or identify the object. An example of the failure of the retrieval process is shown in Figure 23, where we can see that the best model retrieved for the tomato is an image of an apple and the best model retrieved for the apple is a pear.

Figure 23. Failure of the Retrieval Process.
Indeed, in the Leibe and Schiele database, some views are ambiguous (see Figure 23). There are also some views of animals (cow, dog, horse) that cannot easily be distinguished. The retrieval process fails in these cases. As the number of ambiguous images does not exceed 100 in the whole database (among 4000 images), the failure rate of the retrieval process does not exceed 2.5%.
The failure is due to the view itself, which does not correctly differentiate the object from the others. Even human vision, the most sophisticated visual system, may find it difficult to recognize that the objects in Figure 23 are tomatoes, apples, pears, a ball, etc. This is due to the poor viewpoints from which some images of these objects were taken.
7 Conclusion
In this article, we proposed a new method for silhouette retrieval. The silhouettes are described by textual shape descriptors and are indexed using both the quad-tree structure and geometric quasi-invariants.
The measures used were the following:
The filling rates of the quadrants, to reduce the number of models maintained for the next steps: we used two decomposition levels to reduce the processing time, which leads to a fast indexing process.
The geometric quasi-invariants to efficiently compare the geometry of the query with the geometry of the models. Such descriptors are extracted from the textual representation of the outline shapes.
The experiments, which were performed on two known databases, showed the method’s efficiency and its usefulness in resolving the problem of the retrieval process.
In future works, we will take into account noisy and occluded shapes to achieve the recognition process.
Bibliography
[1] S. Aouat and S. Larabi, Indexing binary images using quad-tree decomposition, in: International Conference on Systems, Man, and Cybernetics, pp. 10–13, Istanbul, Turkey, 2010. DOI: 10.1109/ICSMC.2010.5641701.
[2] S. Aouat and S. Larabi, Matching descriptors of noisy outline shapes, Int. J. Image Graphics 10 (2010), 299. DOI: 10.1142/S0219467810003792.
[3] R. Arandjelovic and A. Zisserman, Efficient image retrieval for 3D structures, in: Proceedings of the British Machine Vision Conference (BMVC 2010), Aberystwyth, UK, pp. 1–11, 2010. DOI: 10.5244/C.24.30.
[4] T. M. Cronin, Visualizing concave and convex partitioning of 2D contours, Pattern Recog. Lett. 24 (2003), 429–443. DOI: 10.1016/S0167-8655(02)00267-2.
[5] C. M. Cyr and B. B. Kimia, A similarity-based aspect-graph approach to 3D object recognition, Int. J. Comput. Vis. 57 (2004), 5–22. DOI: 10.1023/B:VISI.0000013088.59081.4c.
[6] P. Gros and L. Quan, 3D projective invariant from two images, in: Proceedings of the SPIE Conference on Geometric Methods in Computer Vision II, pp. 75–86, San Diego, CA, July 1993. DOI: 10.1117/12.146647.
[7] G. M. Hunter and K. Steiglitz, Operations on images using quad trees, IEEE Trans. Pattern Anal. Machine Intell. PAMI-1 (1979), 145–154. DOI: 10.1109/TPAMI.1979.4766900.
[8] D. Keysers, T. Deselaers and T. M. Breuel, Optimal geometric matching for patch-based object detection, Electron. Lett. Comput. Vis. Image Anal. 6 (2007), 44–54. DOI: 10.5565/rev/elcvia.136.
[9] Y. Lamdan and H. J. Wolfson, Geometric hashing: a general and efficient model-based recognition scheme, in: Second International Conference on Computer Vision (ICCV 1988), Tampa, FL, 1988.
[10] B. Lamiroy and P. Gros, Rapid object indexing and recognition using enhanced geometric hashing, in: 4th European Conference on Computer Vision, Cambridge, UK, 1996. DOI: 10.1007/BFb0015523.
[11] S. Larabi and S. Bouagar, An XML language for writing descriptors of silhouettes, in: GVIP '05 Conference, Cairo, 2005.
[12] L. J. Latecki, R. Lakaemper and D. Wolter, Optimal partial shape similarity, Image Vis. Comput. 23 (2005), 227–236. DOI: 10.1016/j.imavis.2004.06.015.
[13] B. Leibe and B. Schiele, Analyzing appearance and contour based methods for object categorization, in: International Conference on Computer Vision and Pattern Recognition, Madison, WI, 2003.
[14] T. Ma and L. J. Latecki, From partial matching through local deformation to robust global shape similarity for object detection, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011), Colorado Springs, CO, pp. 1441–1448, 2011.
[15] S. Matusiak, M. Daoudi, T. Blu and O. Avaro, Sketch-based images database retrieval, in: Advances in Multimedia Information Systems, pp. 185–191, Springer, Berlin Heidelberg, 1999. DOI: 10.1007/3-540-49651-3_18.
[16] F. Mokhtarian, Silhouette-based isolated object recognition through curvature scale space, IEEE Trans. Pattern Anal. Machine Intell. 17 (1995), 539–544. DOI: 10.1109/34.391387.
[17] F. Mokhtarian and A. K. Mackworth, A theory of multiscale, curvature-based shape representation for planar curves, IEEE Trans. Pattern Anal. Machine Intell. 14 (1992), 789–805. DOI: 10.1109/34.149591.
[18] F. Mokhtarian and S. Abbasi, Shape similarity retrieval under affine transforms, Pattern Recog. 35 (2002), 31–41. DOI: 10.1016/S0031-3203(01)00040-1.
[19] J. Philbin, O. Chum, M. Isard, J. Sivic and A. Zisserman, Object retrieval with large vocabularies and fast spatial matching, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2007), Minneapolis, MN, 2007. DOI: 10.1109/CVPR.2007.383172.
[20] S. Ranade, A. Rosenfeld and H. Samet, Shape approximation using quadtrees, Pattern Recog. 15 (1982), 31–40. DOI: 10.1016/0031-3203(82)90058-9.
[21] H. Samet, Region representation: quadtrees from binary arrays, Comput. Graphics Image Process. 13 (1980), 88–93. DOI: 10.1016/0146-664X(80)90118-5.
[22] H. Samet, Applications of spatial data structures: computer graphics, image processing, and GIS, Addison-Wesley, Reading, MA, 1990.
[23] H. Samet and R. E. Webber, Hierarchical data structures and algorithms for computer graphics, Computer Science TR-1752, University of Maryland, College Park, MD, January 1987.
[24] M. Shneier, Calculations of geometric properties using quadtrees, Comput. Graphics Image Process. 16 (1981), 296–302. DOI: 10.1016/0146-664X(81)90042-3.
[25] J. Shotton, T. Sharp, A. Kipman, A. Fitzgibbon, M. Finocchio, A. Blake, M. Cook and R. Moore, Real-time human pose recognition in parts from single depth images, Commun. ACM 56 (2013), 116–124. DOI: 10.1145/2398356.2398381.
[26] A. Torralba, K. P. Murphy and W. T. Freeman, Using the forest to see the trees: exploiting context for visual object detection and localization, Commun. ACM 53 (2012), 107–114. DOI: 10.1145/1666420.1666446.
[27] N. H. Trinh and B. B. Kimia, Skeleton search: category-specific object recognition and segmentation using a skeletal shape model, Int. J. Comput. Vis. 94 (2011), 215–240. DOI: 10.1007/s11263-010-0412-0.
[28] X. Wang, X. Bai, W. Liu and L. J. Latecki, Feature context for image classification and object detection, Comput. Vis. Pattern Recog. 2011 (2011), 961–968.
[29] J. R. Woodwark, The explicit quad tree as a structure for computer graphics, Comput. J. 25 (1982), 383–390. DOI: 10.1093/comjnl/25.2.235.
[30] C. Wu, Y. Kuo and W. Hsu, Large-scale simultaneous multi-object recognition and localization via bottom up search-based approach, in: Proceedings of the 20th ACM International Conference on Multimedia (MM '12), pp. 969–972, ACM, New York, 2012. DOI: 10.1145/2393347.2396359.
[31] X. Yang, X. Bai, L. J. Latecki and Z. Tu, Improving shape retrieval by learning graph transduction, ECCV 4 (2008), 788–801. DOI: 10.1007/978-3-540-88693-8_58.
[32] N. Zaeri, F. Mokhtarian and A. Cherri, Binarized eigenphases applied to limited memory recognition systems, Pattern Anal. Appl. 11 (2008), 373–383. DOI: 10.1007/s10044-008-0129-7.
©2014 by Walter de Gruyter Berlin Boston
This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.