In surveillance, we describe an event in six facets, namely What, When, Who, Why, Where and How (5W1H), which can be generalized to characterize any surveillance event (Westermann & Jain, 2007). In line with studies in video mining and video retrieval (Dai, Zhang, & Li, 2006; Geetha & Narayanan, 2010), events are regarded as consisting of these six major components in event recognition and modelling (Xie, Sundaram, & Campbell, 2008). In computing, a visual event represents an action or occurrence that can be quantified and recognised by a machine. Similarly, an event in this paper is defined as the occurrence of something at a particular time and at a specific location. To enable a computer to record, index and arrange video events for users' post-analysis, each event has a number of attributes, including ID, time, location and description. Based on these attributes, events are detected in surveillance videos and classified into different classes. Based on their classes, the detected events can be grouped into normal and abnormal ones. For example, Figure 1 shows a normal and an abnormal event. Normally, a pedestrian walks in a standing position, as in Figure 1(a). When the walker or bystander falls down, as in Figure 1(b), the abnormal event should be detected and a surveillance alarm should be generated automatically.
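The event record and the normal/abnormal grouping described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the four attributes (ID, time, location, description) come from the text, while the `event_class` field, the `ABNORMAL_CLASSES` set, the `group_events` helper and the sample values are hypothetical names and data introduced only for this example.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumption: the set of classes treated as abnormal. The text gives only
# "falling down" as an example of an abnormal event.
ABNORMAL_CLASSES = {"fall"}


@dataclass(frozen=True)
class Event:
    # Attributes named in the text.
    event_id: int
    time: datetime
    location: str
    description: str
    # Assumption: a class label assigned by some upstream detector.
    event_class: str

    @property
    def is_abnormal(self) -> bool:
        return self.event_class in ABNORMAL_CLASSES


def group_events(events):
    """Split events into (normal, abnormal) lists, preserving input order."""
    normal = [e for e in events if not e.is_abnormal]
    abnormal = [e for e in events if e.is_abnormal]
    return normal, abnormal


# Illustrative sample data only.
events = [
    Event(1, datetime(2024, 1, 1, 9, 0), "entrance", "pedestrian walking", "walk"),
    Event(2, datetime(2024, 1, 1, 9, 5), "corridor", "pedestrian falls down", "fall"),
]

normal, abnormal = group_events(events)
for e in abnormal:
    # Stand-in for raising a surveillance alarm.
    print(f"ALARM: event {e.event_id} at {e.location}: {e.description}")
```

Keeping the class label separate from the normal/abnormal decision means the set of abnormal classes can be changed without touching the stored event records.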