In wireless camera networks, the communication load between cameras is a major concern for visual tracking. To save bandwidth, traditional applications transfer spatial coordinates, which presupposes camera calibration; this is computationally impractical for large and mobile camera networks. In this chapter, we use distinctive, fast-to-compute local features to represent non-rigid targets. Feature descriptors are transmitted between cameras without any calibration. By combining Haar-like patterns with relative color information, our local features succeed in re-identifying and relocating the target among the distributed cameras. Furthermore, efficient interest point detection and matching schemes are proposed for visual tracking under real-time constraints.
Introduction
In this chapter, the objective of the camera network is to detect and track, online, a target moving in hostile environments. The ability of the system to detect the target state automatically is very important for its efficient functioning and for security purposes, since the system should adopt a specific behavior according to each state. Camera networks are also used for monitoring production systems to address industrial risks, monitoring houses for safety or home automation, air and transport control in general, precision agriculture, and intelligent alarms for the prevention of natural disasters. In such systems, the automatic control of an event or an incident relies on the reliability of the network for efficient and robust decision-making.
Collaborative information processing in camera networks is becoming a very attractive field of research. In such a network, the role of the imaging sensors is not limited to detecting data and transmitting it to a central unit for processing. Individual cameras are able to process the raw data and transmit only abstracted information to a fusion unit. The sensors can collaborate and exchange information to reach the best possible decision. Such imaging sensors are called “smart” cameras. Contrary to the centralized approach, the system does not depend on a single processing unit whose failure brings down the entire system (see Figure 1). Every smart sensor is able to play a central role and provide a suboptimal decision. The system is thus very robust against a possible external attack or a technical failure of the central unit. In addition, as collected data are processed locally, only relevant information is exchanged between smart nodes, which limits the required communication bandwidth. In a centralized network, by contrast, all sensors transmit raw data to a single processing unit, which increases the required communication bandwidth.
Figure 1. Distributed wireless embedded smart cameras
For visual tracking in wireless camera networks, local image processing is strongly constrained by the limited processing capability of each node. Therefore, the design of a wireless camera network should pay particular attention to reducing the overhead of local computation. Visual tracking usually consists of two steps: the first represents the target as a reference model in the form of feature descriptors; the second infers the best location by comparing the reference model with the current frame. However, in the localization step, it is challenging to determine whether a candidate box contains the true target. This is because the appearance of the target is unstable over time. Typically, in human tracking, the non-rigid object model (size, shape, color, etc.), irregular movement, and frequent occlusion demand more robustness from the tracker. Probabilistic methods have been introduced to infer the location in uncertain and complex situations. Particle filtering (Doucet, Godsill, & Andrieu, 2000; Isard & Blake, 1998; Pérez, Vermaak, & Blake, 2004) is widely used in the literature. However, it has been shown that the particle filter may fail in high-dimensional state spaces and that the number of particles is critical to its performance. If the tracking environment is difficult, more particles are needed to track the target position. Variational inference (Vermaak, Lawrence, & Pérez, 2003) is an interesting alternative to particle filter tracking. However, variational inference still uses Monte Carlo sampling to compute the likelihood function. The performance of these two Bayesian inference algorithms depends heavily on a distinctive and consistent reference model. In fact, if the feature model of the true target varies across the scene, no inference tool will work well using only the initialized reference model.
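To make the particle filtering idea concrete, the following is a minimal sketch of one predict/update/resample cycle of a generic bootstrap particle filter over 2D image positions. It is not the tracker developed in this chapter. The function name `particle_filter_step`, the random-walk motion model, the Gaussian likelihood, and the default noise parameters are all assumptions made for illustration; in a real tracker the likelihood would come from comparing each candidate's feature descriptors with the reference model.

```python
import math
import random


def particle_filter_step(particles, weights, observation, rng,
                         motion_std=2.0, obs_std=5.0):
    """One cycle of a bootstrap particle filter (illustrative sketch).

    particles   : list of (x, y) candidate target positions
    weights     : normalized weights, same length as particles
    observation : (x, y) noisy location cue; a hypothetical stand-in for
                  an appearance-matching score against the reference model
    rng         : random.Random instance, for reproducibility
    Returns (resampled_particles, uniform_weights, estimate).
    """
    n = len(particles)

    # Predict: propagate each particle with an assumed random-walk motion model.
    moved = [(x + rng.gauss(0.0, motion_std), y + rng.gauss(0.0, motion_std))
             for x, y in particles]

    # Update: reweight by an assumed isotropic Gaussian likelihood.
    new_weights = []
    for (x, y), w in zip(moved, weights):
        d2 = (x - observation[0]) ** 2 + (y - observation[1]) ** 2
        new_weights.append(w * math.exp(-0.5 * d2 / obs_std ** 2))

    total = sum(new_weights)
    if total <= 0.0:
        # All likelihoods underflowed: fall back to uniform weights.
        new_weights = [1.0 / n] * n
    else:
        new_weights = [w / total for w in new_weights]

    # Estimate: weighted mean of the particle positions.
    est_x = sum(w * p[0] for w, p in zip(new_weights, moved))
    est_y = sum(w * p[1] for w, p in zip(new_weights, moved))

    # Resample (multinomial) to concentrate particles on likely positions.
    resampled = rng.choices(moved, weights=new_weights, k=n)
    return resampled, [1.0 / n] * n, (est_x, est_y)
```

The sketch also shows why the number of particles matters: the weighted-mean estimate is only as good as the coverage of the state space by `particles`, so a harder scene or a higher-dimensional state (for example, adding scale and velocity) requires many more samples for the same accuracy.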