Mining Project Failure Indicators From Big Data Using Machine Learning Mixed Methods

Kenneth David Strang (RMIT, Australia & W3 Research, USA) and Narasimha Rao Vajjhala (American University of Nigeria, Nigeria)
Copyright: © 2023 |Pages: 24
DOI: 10.4018/IJITPM.317221


The literature revealed that approximately 50% of IT-related projects around the world fail, which must frustrate sponsors and decision makers, since their ability to forecast success is statistically about the same as a random coin toss. Some project success/failure factors have been identified, but their effect sizes were often statistically negligible. A pragmatic mixed methods recursive approach was applied, using structured programming, machine learning (ML), and statistical software to mine a large data source for probable project success/failure indicators. Seven feature indicators were detected from ML, producing an accuracy of 79.9%, a recall rate of 81%, an F1 score of 0.798, and a ROCa of 0.849. A post-hoc regression model confirmed three indicators were significant with a 27% effect size. The contributions made to the body of knowledge included a conceptual model comparing ML methods by artificial intelligence capability and research decision making goal, a mixed methods recursive pragmatic research design, an application of the random forest ML technique with post-hoc statistical methods, and a preliminary list of IT project failure indicators analyzed from big data.
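To make the reported evaluation measures concrete, the following minimal sketch shows how accuracy, precision, recall, and the F1 score are derived from the four cells of a binary confusion matrix. The function name `classification_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not the study's data. The ROC area is not computed here because it requires ranked prediction scores rather than a single confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification measures from confusion matrix counts.

    tp: failed projects correctly predicted as failed (true positives)
    fp: successful projects wrongly predicted as failed (false positives)
    fn: failed projects wrongly predicted as successful (false negatives)
    tn: successful projects correctly predicted as successful (true negatives)
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts (assumption for illustration only, not from the article).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts every measure equals 0.8. A coin toss on a balanced sample would score roughly 0.5 on each, which is the baseline the abstract compares against.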
Article Preview


Approximately half of Information Technology (IT)-related projects around the world have failed (Kurek, Johnson, & Mulder, 2017; Masticola, 2007; Strang, 2021). In 2009 the U.S.-based Standish Group (2009) found that only 32% of projects in the American government were successful; the remaining 68% were challenged or outright failures. In European Union countries, a 50% procurement project failure rate was reported in the large, rigorous seminal study by Ghossein, Islam, and Saliola (2018). The nearly 50% project failure rate was corroborated in two large, rigorous empirical U.S. government-based studies, with no statistical evidence found to account for the problems (Borbath, Blessner, & Olson, 2019; Eckerd & Snider, 2017). Pace (2019) argued that U.S. IT-related project failure rates have remained steady over at least 20 years despite significant advances in software and methodologies. A project failure rate as high as 90% may be expected in industries such as R&D or space exploration, but not in IT. Israel (2012, p. 76), a former project leader at the U.S. Federal Bureau of Investigation, reviewed decades of IT-related public projects from an insider perspective, and he wrote this surreal synthesis: “the federal government has wasted billions of taxpayer dollars on failed projects.” This 50/50 gamble between project success and failure must leave stakeholders perplexed: analytical approaches have improved other fields such as medical drug prediction (e.g., for cancer and COVID-19), yet an IT project sponsor’s ability to forecast success is statistically about the same as a random coin toss.

From a researcher's perspective, it seems unusual that only a few project management-related journals have published empirical studies exploring the high project failure rate in an effort to improve the body of knowledge. The references illustrate which journals are championing the scientific search for this elusive answer. The authors found it frustrating that some journals predominantly published single case studies of so-called megaprojects (i.e., large projects). Most often the goal of a single case study was to discuss a project success, not a failure, at one site. The problem with those studies was that the results were speculative and difficult to generalize, such as a study of risk management at a large global oil platform in an oligopolistic market. It was not clear whether the findings were statistically significant and, moreover, any results would generalize only to equivalent populations, namely other oil rigs in the ocean. Other journals have favored surveys or interviews to collect perceptions of failure. Three problematic issues with those survey data collection approaches were poor designs, common method bias (no triangulation of evidence), and asking for opinions of project performance instead of collecting actual metrics.

On the positive side, some empirical studies have revealed what is causing projects to fail. Attributes such as ISO quality approval, years of experience, prior project duration, communication skills, leadership, project manager (PM) certification, gender, corruption, and incompetency (ineffective project management) have been found to impact project outcomes (Anthopoulos, Reddick, Giannakidou, & Mavridis, 2016; Jennings, Lodge, & Ryan, 2018; Laurie, Rana, & Simintiras, 2017; Martinez-Perales, Ortiz-Marcos, Ruiz, & Lazaro, 2018; Ngonda & Jowah, 2020; Pace, 2019; Saadé, Dong, & Wan, 2015; Strang, 2021). The problem with those empirical studies was the small effect sizes: even when a causal factor was identified, its practical impact was negligible, leaving 88-98% of the variation unaccounted for. For decision makers, this means the significant models of project failure have little economic utility compared to the unknown factors. For other stakeholders, including higher education professors, project management practitioners, and IT management associations, those small effect sizes were not enough to justify amendments to the body of knowledge.
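The relationship between effect size and unaccounted variation can be written out as a short worked calculation. This assumes the effect sizes are expressed as a coefficient of determination ($R^2$), which the passage implies but does not state explicitly:

```latex
\[
\text{unexplained variation} = 1 - R^2
\]
\[
R^2 \in [0.02,\ 0.12]
\quad\Longrightarrow\quad
1 - R^2 \in [0.88,\ 0.98]
\]
```

Under the same assumption, the 27% effect size reported in the abstract would correspond to $1 - 0.27 = 0.73$, or 73% of the variation remaining unexplained.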
