Introduction
In daily life, we are surrounded by a community of living organisms together with nonliving components. The latter keep surprising us, both with phenomena that support survival and with disasters that threaten the ecosystem on Earth. Among the disasters that threaten human life and cause the greatest damage are earthquakes.
Mining hazards are the dangers connected with mining activities. They cause disasters and accidents, and they play an important role in shaping industrial safety in coal mines. As with earthquakes, detecting and predicting seismic hazards is among the hardest problems in natural hazard detection. Seismic activity and seismic hazards in underground coal mines arise from the specific structure of the geological deposit and from the way the coal is exploited. The nature of these hazards is influenced by a large number of factors, whose mutual relationships are complex and insufficiently understood. A particularly intense example occurs in the Upper Silesian Coal Basin, where additional conditions apply: the multi-seam structure of the deposit, the consequences of the long history of exploitation of the area, and a complex surface infrastructure. Almost all mines in this area operate systems that detect and assess the current degree of seismic hazard (Kabiesz, 2006). The hazard of a high-energy destructive tremor, which may result in a rock burst, is one of the major subjects studied by coal mine geophysical stations. As a phenomenon related to mining seismicity, rock bursts pose a serious hazard to miners and can destroy longwalls and equipment.
Data engineering and knowledge discovery emerged in the information era, in which the exponential growth in data volume has given data mining techniques an important role in daily civilian life, including scientific data exploration. One such application is the classification of seismic hazards in coal mines, which is the aim of this work. This classification relies on two basic techniques, which are discussed in Section 2.1. Seismic hazard prediction covers not only dangerous cases but also normal ones. The difficulty lies in gaining geologists' confidence, which is undermined by the number of false alarms. A false alarm is a normal case predicted as dangerous, and each one costs geologists money and time spent on surveillance.
A common problem in real-life cases such as seismic hazards is that some important cases occur far less often than others. In seismic hazard data, for example, more than 90% of the recorded samples are non-hazardous, meaning no danger, whereas the main goal of researchers is to predict the dangerous cases. This imbalance in class distribution makes it difficult to learn from the recorded data in order to develop an intelligent system that predicts seismic danger in coal mines. Figure 1 shows the percentage of hazardous cases against non-hazardous ones.
Figure 1. The percentage of non-hazardous against hazardous cases
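To make the consequence of this imbalance concrete, the following minimal Python sketch scores a trivial classifier that always predicts the most frequent class. The 93/7 split and the function name `majority_baseline_metrics` are assumptions made for illustration only; the split is chosen merely to be consistent with "more than 90%" and is not the actual distribution of the data set.

```python
from collections import Counter


def majority_baseline_metrics(labels, positive="hazardous"):
    """Score a classifier that always predicts the most frequent label.

    Returns the predicted label, the accuracy, and the recall on the
    `positive` (minority) class.
    """
    majority_label, majority_count = Counter(labels).most_common(1)[0]
    predictions = [majority_label] * len(labels)

    accuracy = majority_count / len(labels)

    # Recall: share of truly hazardous cases that were predicted hazardous.
    actual_pos = sum(1 for y in labels if y == positive)
    true_pos = sum(
        1 for y, p in zip(labels, predictions) if y == positive and p == positive
    )
    recall = true_pos / actual_pos if actual_pos else 0.0
    return majority_label, accuracy, recall


# Hypothetical label counts, NOT taken from the real data set.
labels = ["non-hazardous"] * 93 + ["hazardous"] * 7

label, accuracy, recall = majority_baseline_metrics(labels)
print(f"always predicts: {label}")
print(f"accuracy: {accuracy:.2f}, recall on hazardous: {recall:.2f}")
```

Under these assumed counts the baseline looks accurate while detecting no hazardous case at all, which is why accuracy alone is a misleading measure on such data.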
With a data set like the seismic one, the probability of classifying a hazardous case as non-hazardous is very high. As a simple example, consider classifying a hazardous case with the k-Nearest Neighbors (KNN) algorithm: the majority of its neighbors is more likely to belong to the non-hazardous class than to the hazardous class. The same problem can arise with any classification algorithm, which results in a high error rate when predicting seismic dangers in coal mines.
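The KNN effect described above can be illustrated with a small, self-contained sketch. The two-dimensional points, the class counts, and the helper `knn_predict` are hypothetical and serve only as an illustration; this is a plain majority-vote KNN with Euclidean distance, not the classifier or the data used in this work.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Plain majority-vote KNN using Euclidean distance."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 2 hazardous points surrounded by 8 non-hazardous ones.
hazardous = [(0.0, 0.0), (0.2, 0.0)]
non_hazardous = [
    (0.5, 0.0), (0.0, 0.5), (-0.5, 0.0), (0.0, -0.5),
    (0.5, 0.5), (-0.5, -0.5), (1.0, 1.0), (-1.0, 1.0),
]
points = hazardous + non_hazardous
labels = ["hazardous"] * len(hazardous) + ["non-hazardous"] * len(non_hazardous)

# The query lies between the two hazardous points.
query = (0.1, 0.0)

for k in (1, 3, 5, 7):
    print(k, knn_predict(points, labels, query, k))
```

In this toy setting the query is labeled hazardous for small k, but once k exceeds what the two minority points can outvote, the surrounding non-hazardous neighbors decide the result, even though the query sits in the middle of the hazardous cluster.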
This paper presents an informed sampling approach, based on the k-means clustering technique, for dealing with the unbalanced seismic data set. The remainder of the paper is organized as follows. Section 2 reviews the state of the art, covering all aspects touched on in this work: the two basic techniques for classifying seismic hazards in coal mines, some related work on seismic hazard detection, and the best-known solutions in the literature for dealing with unbalanced data. Section 3 details the steps of the proposed approach and the metrics used to evaluate it, while Section 4 presents the results obtained. Finally, the major conclusions are given in Section 5.