Explainable Safety Risk Management in Construction With Unsupervised Learning

DOI: 10.4018/978-1-6684-5643-9.ch011


The success of machine learning (ML) approaches has encouraged their widespread adoption across different fields. Owing to its high accident rate, the construction industry has embraced ML in the risk assessment procedure. What if a machine could produce knowledge of the relationships between the risk features and accident outcomes contained in a safety dataset? What if machines could explain an accident dataset without human intervention? Unsupervised ML techniques offer several advantages over supervised approaches, including the ability to explain and make sense of complex datasets. This chapter demonstrates the practical implementation of two unsupervised learning methods, clustering and dimensionality reduction, to explain the similarities, differences, variances, and patterns that exist among the features of an occupational safety risk dataset. Principal component analysis (PCA) and K-means clustering with silhouette analysis were selected to demonstrate how unsupervised ML can enhance data-centric decision-making during the construction risk assessment procedure.
Chapter Preview


Construction projects are massive and complex operations involving an enormous number of interlinked processes and activities. The occupational risk assessment of a construction site includes the identification, localization, and assessment of all safety risks involved in each of these activities. As a result, even small construction sites are often exposed to a wide range of occupational risks. Most of the identified occupational risks are recognized as high-impact and severe; hence, they must be addressed using appropriate risk-response strategies. Avoidance, mitigation, transfer, and control are the most common risk-response strategies. Selecting and implementing them demands an accurate assessment of each response method in terms of its budget and time requirements, as well as its level of control over a particular safety risk. Effective mitigation and control of construction risks require allocating the main portion of the risk budget to the areas that have the most influence on controlling most of the identified risk items. The more information obtained about construction risks, the less exposure the project will have. Therefore, an important managerial challenge is to identify these risk areas and evaluate their contributions to reducing workplace risk.

Unsupervised ML approaches offer several advantages over supervised techniques, including cost-effectiveness, dimensionality reduction, exploratory data analysis, clustering, handling of missing data, robustness to outliers, and anomaly detection. One of their benefits is the ability to analyze and understand complex datasets. Techniques such as PCA and K-means clustering can provide a useful explanation of the patterns inside a dataset without the need for manual labeling. PCA helps localize these risk areas and their contributions to reducing the overall severity of construction accidents.
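As a rough illustration of how PCA can point to the features that carry most of the variance, the following minimal sketch computes the first principal component of a small synthetic dataset using only the Python standard library. The feature names, the synthetic records, and the power-iteration solver are assumptions made for this example; they are not the chapter's dataset or implementation, and in practice a library routine would normally be used.

```python
import math
import random

# Hypothetical risk features (illustrative names, not from the chapter's dataset).
FEATURES = ["working_height", "task_duration", "crew_size"]


def make_synthetic_records(n=200, seed=0):
    """Synthetic data: the first two features share a latent factor, the third is small noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent = rng.gauss(0, 1)
        rows.append([
            latent + rng.gauss(0, 0.1),
            latent + rng.gauss(0, 0.1),
            rng.gauss(0, 0.1),
        ])
    return rows


def covariance_matrix(rows):
    """Sample covariance matrix of the mean-centered columns."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_component(cov, iters=200):
    """Power iteration: largest eigenvalue of cov and its unit eigenvector."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigenvalue, v


records = make_synthetic_records()
cov = covariance_matrix(records)
eigenvalue, loadings = leading_component(cov)
# Share of total variance (trace of the covariance matrix) captured by PC1.
explained_ratio = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

print(f"PC1 explains {explained_ratio:.1%} of the variance")
for name, weight in zip(FEATURES, loadings):
    print(f"  {name}: loading {weight:+.2f}")
```

Features with large absolute loadings on the first component are the ones that vary together most strongly. Real features measured on different scales would normally be standardized before this step.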
K-means clustering partitions an unlabeled dataset into clusters without supervision and helps identify abnormal data points within the dataset.
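The sketch below illustrates K-means clustering combined with silhouette analysis to choose the number of clusters. It is a minimal standard-library example under stated assumptions: the two-dimensional points are synthetic, and the centroids are seeded with a deterministic farthest-first rule (an assumption chosen here for reproducibility, in place of the random initialization commonly used). For each point, the silhouette value is (b - a) / max(a, b), where a is the mean distance to the other points in its own cluster and b is the mean distance to the points of the nearest other cluster.

```python
import math
import random


def make_blobs(centers, per_cluster=30, spread=0.5, seed=0):
    """Synthetic 2-D points scattered around the given centers (illustrative data only)."""
    rng = random.Random(seed)
    return [(cx + rng.gauss(0, spread), cy + rng.gauss(0, spread))
            for cx, cy in centers for _ in range(per_cluster)]


def farthest_first(points, k):
    """Deterministic seeding: repeatedly pick the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(max_iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the old centroid for an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels


def silhouette_score(points, labels):
    """Mean silhouette value over all points; singleton clusters contribute 0."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_label, other in clusters.items() if other_label != label)
        total += (b - a) / max(a, b)
    return total / len(points)


points = make_blobs([(0, 0), (10, 0), (5, 9)])
scores = {k: silhouette_score(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: mean silhouette {score:.3f}")
print(f"selected k = {best_k}")
```

A mean silhouette close to 1 indicates compact, well-separated clusters. Individual points with low or negative silhouette values sit between clusters and are natural candidates for closer inspection as abnormal records.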

Key Terms in this Chapter

Principal Component Analysis (PCA): An unsupervised method for identifying the components (linear combinations of the original variables) that preserve the maximum amount of variance in a dataset. PCA is useful for applications such as reducing dimensionality with minimal information loss and increasing the explainability of high-dimensional datasets.

Supervised Machine Learning: Training ML algorithms with labeled input data to perform a particular task.

K-Means Clustering: An unsupervised technique for grouping the observations within a dataset into a smaller number of clusters.

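Silhouette Analysis: A visual method that evaluates how well separated the clusters are using the silhouette score, which compares each observation's distance to its own cluster with its distance to the nearest neighboring cluster.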

Unsupervised Machine Learning: Training ML algorithms without labeled input data through an understanding of the patterns within the dataset.

Artificial Intelligence (AI): Modelling the processes of human intelligence using a machine system.

Machine Learning (ML): A subset of AI in which algorithms learn from input data to perform a variety of tasks.
