Introduction
We live in a world of information because we believe that information leads to power and success. As technology has developed, the amount of available information has grown continually and exponentially. In the modern digital world, relevance and information overload have recently become topical issues in the scientific community, especially in the domain of electronic imagery (photos are constantly taken, shared, viewed and retrieved). This type of data has spread rapidly with the growth of digital photography and mobile phones.
The latest statistics show that the number of images on the internet is estimated at more than 15 billion (especially images of humans). For these reasons, the automatic classification of this gigantic image collection has become a substantial problem in computer science.
For instance, suppose we are faced with thousands of images containing people making different gestures. We ask a human to classify these images into groups, so that images in the same group show people making the same gesture and images in different groups show people making different gestures, as shown in Figure 1. Our work reproduces this task on the machine, in order to help search engines respond better to users' requests. For example, when we type the query “Zidane hits the ball” into Google, we want only images of Zidane making the gesture of hitting the ball. The results obtained are depicted in Figure 2.
Figure 1. Images with different gestures (Singh, 2010)
Figure 2. Results returned by Google for the query “zidane hit the ball”
Among the first 20 images returned by the Google search, only five showed Zidane hitting the ball (red rectangle) and could be considered relevant. Consequently, if Google clustered images by human gesture, then instead of searching all pictures it would search only the cluster of images with the human gesture “hit the ball”, and the results would be more pertinent. This is the context of our work.
For this reason, we need to develop tools that help us find the desired images within a reasonable time and that support the extraction of knowledge from a set of images. One such tool is supervised image classification, which predicts whether or not an image is a member of a predefined class (Tan, 1999). This technique has real limits and a set of disadvantages, such as:
- Supervised classification requires more resources (the intervention of a human expert and external information).
- The number of classes is not always known in advance.
- The choice of the training set.
- The classification has little consistency.
Given these limits of supervised classification, our study focuses on clustering techniques (automatic classification), which do not need a supervisor and use data that contain only the inputs. These techniques produce an implicit model that groups a set of images into a set of clusters so as to minimize the intra-class inertia and maximize the inter-class inertia. However, even the classical clustering methods suffer from several problems and limitations:
- Image representation and indexing.
- The choice of similarity measure.
- Execution time.
- The number of initial clusters.
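
To make the inertia criterion mentioned above more concrete, the following minimal sketch computes the intra-class and inter-class inertia of a given partition. It is an illustration only, not the method of this article: the function names (`centroid`, `squared_distance`, `inertias`), the use of squared Euclidean distance as the similarity measure, and the tiny 2-D vectors standing in for image descriptors are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def centroid(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (an assumed choice of similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inertias(features: List[Vector], labels: List[int]) -> Tuple[float, float]:
    """Return (intra_class, inter_class) inertia for a given partition."""
    # Group the feature vectors by their cluster label.
    clusters: Dict[int, List[Vector]] = {}
    for vec, lab in zip(features, labels):
        clusters.setdefault(lab, []).append(vec)

    global_center = centroid(features)
    intra = 0.0
    inter = 0.0
    for members in clusters.values():
        center = centroid(members)
        # Intra-class: spread of each item around its own cluster centre.
        intra += sum(squared_distance(v, center) for v in members)
        # Inter-class: size-weighted spread of cluster centres around the global centre.
        inter += len(members) * squared_distance(center, global_center)
    return intra, inter


# Hypothetical 2-D feature vectors standing in for image descriptors.
features = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = inertias(features, [0, 0, 1, 1])  # groups nearby points together
bad = inertias(features, [0, 1, 0, 1])   # mixes distant points in each cluster
print(good, bad)
```

With these definitions the two quantities always sum to the total inertia of the data, so on a fixed data set a partition with lower intra-class inertia necessarily has higher inter-class inertia. In this toy data, the partition that groups nearby points has the smaller intra-class value.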