Introduction
Human stampedes are among the most feared crowd disasters. They occur in highly crowded settings such as religious pilgrimages, music events, and professional sporting events, and they can also be triggered by abnormal events such as fires or explosions. A study of mortality in mass gatherings (Rodrigues, 2016) reported that around 350 human stampedes occurred between 1980 and 2012, causing approximately 10,000 deaths and 22,000 injuries.
During the last decade, several human stampedes have occurred, causing deaths and injuries among the gathered crowds. Automatic crowd analysis has recently gained attention from the research community as a means of reducing crowd disasters and mishaps (Kang, Ma & Chan, 2018). Crowd behavior analysis includes applications such as the detection of escape behavior, panic events in crowded scenes due to natural disasters, chaotic acts, and violent activities, as well as traffic management and surveillance (Kang et al., 2018). In this context, computer vision-based systems (Yogameena & Nagananthini, 2017) have been widely adopted, in which images and videos of crowded scenes are analyzed automatically. Pedestrian tracking (Shen, Sui, Pan, & Tao, 2016) and crowd behavior analysis (Marsden, McGuinness, Little, & O'Connor, 2016) have also been introduced to improve surveillance systems.
Individual activity analysis has been widely studied in the past, and several promising solutions based on deep learning are now available in the literature. Methods for activity recognition include that of Wei, Jafari, and Kehtarnavaz (2019), multi-stream CNNs (Tu et al., 2018), deep convolutional networks (Kamel et al., 2018), and many more (Kong & Fu, 2018; Qiu, Sun, Guo, Wang, & Zhang, 2019). Despite these significant techniques for individual activity analysis, crowd activity analysis remains challenging, because in crowd scenes many people are located at different positions and move in different directions. Whenever an abnormal activity is triggered, people move abruptly in all possible directions, which makes the movement complex to analyze. Analyzing crowd movement appropriately helps detect possible anomalies, so that the security system can be alerted to take a suitable decision. Generally, anomaly detection is categorized as local or global. In a local anomaly, the behavior of one individual differs from that of the other individuals present in the crowded scene. A global anomaly, on the other hand, concerns the behavior of a group in the crowded scene. Figure 1 illustrates local and global anomaly detection. In Figure 1(a), which represents the local anomaly, the movement of one person is studied, whereas in Figure 1(b) the movement of a group of persons is studied to identify a global anomaly.
Figure 1. (a) Local crowd anomaly (b) Global crowd anomaly
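The distinction between local and global anomalies can be illustrated with a small sketch. It is not a method from the cited works. It assumes that per-person velocity vectors are already available from some tracker, and the function names, thresholds (`z_thresh`, `ratio`, `min_fraction`), and the `baseline_speed` parameter are hypothetical choices made only for illustration.

```python
import numpy as np

def local_anomalies(velocities, z_thresh=3.0):
    """Flag individuals whose velocity deviates strongly from the rest of the crowd.

    velocities: array-like of shape (n_people, 2), one (vx, vy) per person.
    Returns a boolean array with True for each locally anomalous person.
    """
    v = np.asarray(velocities, dtype=float)
    median = np.median(v, axis=0)              # typical crowd velocity
    dist = np.linalg.norm(v - median, axis=1)  # each person's deviation from it
    mad = np.median(dist)                      # robust scale of those deviations
    scale = mad if mad > 0 else 1.0            # avoid division by zero
    return dist / scale > z_thresh

def global_anomaly(velocities, baseline_speed, ratio=2.0, min_fraction=0.5):
    """Flag the scene when a large share of the crowd moves much faster than usual.

    baseline_speed is an assumed "normal" walking speed for the scene.
    """
    v = np.asarray(velocities, dtype=float)
    speeds = np.linalg.norm(v, axis=1)
    fast = speeds > ratio * baseline_speed
    return bool(fast.mean() > min_fraction)

# Hypothetical data: five people walking to the right, one running the other way.
calm = [[1.0, 0.0], [1.1, 0.0], [0.9, 0.0], [1.0, 0.1], [1.0, -0.1], [-5.0, 4.0]]
# Hypothetical data: everyone fleeing in a different direction.
panic = [[4.0, 0.0], [-3.0, 3.0], [0.0, -5.0], [3.5, -2.0]]

print(local_anomalies(calm))
print(global_anomaly(calm, baseline_speed=1.0))
print(global_anomaly(panic, baseline_speed=1.0))
```

In the first scene only the last person is flagged as a local anomaly and the scene as a whole is not flagged. In the second scene the whole group is flagged as a global anomaly.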
Another way of categorizing crowd behavior analysis is into macroscopic and microscopic approaches (Zhang, Zhang, Hu, Guo, & Yu, 2018). Macroscopic approaches model the entire crowd as a single entity: each pixel of a frame is treated as a particle, and particle features are extracted and modelled to analyze crowd behavior, as in fluid-dynamics models and network-based models. One such macroscopic approach, using swarm particle-based optimization, has been suggested in (Qasim & Bhatti, 2019). Microscopic techniques, on the other hand, treat the crowd as a collection of numerous individuals, each of whom is detected and tracked. However, these techniques are suitable only for small-scale crowds, because in highly dense scenarios they fail to detect and track individuals due to occlusion and the spatio-temporal complexity of video sequences.
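As a rough illustration of the macroscopic view, the sketch below treats every pixel as a particle and summarizes the whole frame with one motion descriptor, without detecting or tracking any individual. It is not the model of any cited work. It assumes that a dense per-pixel motion field (for example, from an optical-flow estimator) is already available as an array of shape (height, width, 2), and the function name and the `bins` and `min_mag` parameters are hypothetical.

```python
import numpy as np

def flow_orientation_histogram(flow, bins=8, min_mag=1e-3):
    """Magnitude-weighted histogram of motion directions over all pixels.

    flow: array of shape (H, W, 2) holding a (dx, dy) displacement per pixel.
    Returns a length-`bins` array that sums to 1, or all zeros if nothing moves.
    """
    flow = np.asarray(flow, dtype=float)
    dx, dy = flow[..., 0], flow[..., 1]
    mag = np.hypot(dx, dy)                   # per-pixel motion magnitude
    ang = np.arctan2(dy, dx) % (2 * np.pi)   # per-pixel direction in [0, 2*pi)
    moving = mag > min_mag                   # ignore static pixels
    hist, _ = np.histogram(ang[moving], bins=bins,
                           range=(0.0, 2 * np.pi), weights=mag[moving])
    total = hist.sum()
    return hist / total if total > 0 else hist

# Hypothetical 4x4 frame in which every pixel moves to the right.
uniform = np.zeros((4, 4, 2))
uniform[..., 0] = 1.0
print(flow_orientation_histogram(uniform))
```

A histogram concentrated in one bin suggests coherent motion, while mass spread across many bins suggests people moving in many directions at once. Turning such a descriptor into an anomaly decision would require a model of normal behavior, which this sketch does not attempt.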