In the near future, several radio access technologies will coexist in Beyond 3G (B3G) mobile networks, and they will eventually be transformed into one seamless global communication infrastructure. Self-managing systems (i.e. those that self-configure, self-protect, self-heal and self-optimize) are the solution to the high complexity inherent in these networks. In this context, this chapter proposes a system for automated fault management in the Radio Access Network (RAN) of wireless systems. The chapter presents some basic definitions and describes how fault management is performed in current mobile communication networks. It also discusses some methods proposed for auto-diagnosis, which is the most complex task in fault management. The presented systems use Key Performance Indicators (KPIs) to identify the cause of a network malfunction.
There is no doubt that during the last decade mobile communications have played an increasingly important role in the telecommunication business, and they will continue to do so in the years to come. In recent years, 3G networks, called Universal Mobile Telecommunications System (UMTS) networks in Europe, have started to be deployed throughout the world. In the near future, thanks to 3G, mobile Internet services are expected to be available “anywhere and anytime”. Users will surf the Web, check their email, download files or hold real-time videoconferences in a shopping mall, the airport, the city center or their homes. Beyond 3G (B3G) mobile networks (Jamalipour, 2005) will comprise a set of interrelated and rapidly growing wireless networks, applications that will require increasing bandwidth, and users who will demand high quality of service at low cost, all within a limited spectrum allocation. In these networks, the highly complex and heterogeneous Radio Access Network (RAN) will be composed of different technologies, such as GSM, UMTS and WLAN.
Until now, most operational tasks have been performed manually, requiring dedicated staff and resulting in inflexibility and delayed response. However, network operators are currently showing growing interest in automating most network management activities. This has stimulated intense research in the field of self-managing networks (Pras, 2007; Kephart, 2003; Strassner, 2004). In this context, the self-managing property refers to the capability of the network to self-configure, self-protect, self-heal and self-optimize. These issues have been the main driver behind recent studies dealing with the automation and optimization of cellular networks (Halonen, 2003; Johnson, 2004; Lempiäinen, 2001; Laiho, 2002a).
In a mature cellular network that has undergone most of its site roll-out, the major cost is associated with the operation of the network. As the network consists of a large number of pieces of equipment distributed across the entire country, maintaining and operating this large and technically complicated system is a difficult task that requires operator personnel around the clock in several regional offices. For example, a GSM network in a typical European country may consist of about 10,000 sites. Because the networks are so large, it is common for some of the deployed equipment not to work as planned. The consequence of such problems is poor end-user service. As several operators compete for subscribers in most countries, it is imperative to rectify such faults; otherwise, users will be dissatisfied with the service and will likely switch to competing network operators. Hence, fault management, also called troubleshooting (TS), is a key aspect of operating a cellular system in a competitive environment. As the RAN is by far the biggest part of a cellular system, most TS activities are focused on this area.