Towards Ensemble Learning for Tracking Food Insecurity From News Articles

Andrew Lukyamuzi, John Ngubiri, Washington Okori
Copyright: © 2020 | Pages: 14
DOI: 10.4018/IJSDA.2020100107

Abstract

This study integrates ensemble learning into the task of classifying whether a news article is about food insecurity. Similarity algorithms were exploited to imitate human cognition, an innovation intended to enhance performance. Four of the six classifiers showed performance improvement with this innovation. Articles on food insecurity identified by the best classifier were aggregated into trends, which were comparable with official trends. This paper provides information useful to stakeholders in taking appropriate action depending on prevailing conditions of food insecurity. Two suggestions are put forth to improve performance further: (1) using articles aggregated from several news media and (2) blending more classifiers in an ensemble.

Introduction

Ensemble Learning, the process of blending two or more algorithms to improve performance, is now at the frontier of Machine Learning. The winner of the $1 million Netflix Prize used an ensemble of over 100 models (Andreas, Jahrer, Bell, & Park, 2009). The winner was required to improve the accuracy of Netflix's movie recommendation system by 10% (Hallinan & Striphas, 2016). The challenge attracted over 50,000 contestants from 186 countries (NETFLIX, 2009). The second-best contestant also employed Ensemble Learning (Feuerverger, He, & Khatri, 2012). In the 2015 Kaggle competition on forecasting six weeks of store sales, the winner likewise used ensemble learning. Using an ensemble of residual nets, He, Zhang, Ren, & Sun (2015) won first place in the ILSVRC 2015 classification task. To emphasize the power of ensemble learning further, Wind & Winther (2014) assert that Kaggle competitions are frequently won by competitors who integrate some aspect of ensemble learning. The power of ensemble learning and blended approaches has also been exhibited in other human endeavors (for example, see Abdel Aziz et al., 2016; Bouzaida & Sakly, 2018; Hussein et al., 2017; Majhi, 2018). From this, it can be inferred that high-end performance is highly probable with ensemble learning.

Deep Learning has also come on the stage alongside traditional Machine Learning. A distinguishing feature of Deep Learning is its ability to extract features automatically, as opposed to the manual feature engineering of traditional Machine Learning. Deep Learning uses multiple layers of non-linear operations to deliver state-of-the-art systems, such as in vision and language tasks (Ren, Zhang, & Suganthan, 2016). Because of its tendency to generate better performance than traditional Machine Learning, Deep Learning is increasingly becoming preferable (W. Li, Liu, Zhang, & Liu, 2019). Despite the differences between these Machine Learning variants, both can be enhanced by ensemble learning (Y. Li, Wang, & Xu, 2018). This research integrates an aspect of ensemble learning in the context of both Deep Learning and traditional Machine Learning to unveil any contrasting insights.

A possible approach to ensemble learning is to integrate existing standard algorithms. For example: (1) a CNN was combined with an RNN to improve crop yield prediction (Khaki, Wang, & Archontoulis, 2020); (2) better prediction of soybean yield was observed when a CNN was integrated with an LSTM (Sun, Di, Sun, Shen, & Lai, 2019); and (3) Adaboost generated improved text analysis with datasets such as Reuters-21578 (Bloehdorn & Hotho, 2006). This approach, however, has not always guaranteed performance improvement (W. Li et al., 2019).
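To make the first approach concrete, the following is a minimal sketch of combining existing standard algorithms into a soft-voting ensemble for the kind of article classification this paper addresses. The corpus, labels, and choice of base classifiers are illustrative assumptions, not the authors' actual setup.

```python
# Hedged sketch: blending three standard classifiers via soft voting
# for a toy food-insecurity article classification task.
# The data below is a placeholder, not the paper's dataset.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy corpus: 1 = article on food insecurity, 0 = unrelated.
texts = [
    "drought destroys maize harvest, famine looms in the region",
    "crop failure leaves thousands facing severe hunger",
    "food prices soar as shortages hit local markets",
    "football team wins the national championship final",
    "new smartphone model launched with improved camera",
    "stock market closes higher after central bank meeting",
]
labels = [1, 1, 1, 0, 0, 0]

# Soft voting averages the predicted class probabilities of the
# base classifiers rather than taking a simple majority vote.
ensemble = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(
        estimators=[
            ("nb", MultinomialNB()),
            ("lr", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ],
        voting="soft",
    ),
)
ensemble.fit(texts, labels)
print(ensemble.predict(["famine spreads after failed rains"]))
```

The design choice here is that disagreement among diverse base classifiers is smoothed by averaging probabilities, which is one reason ensembles often outperform their strongest member.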

Another approach to ensemble learning is to introduce an innovation and blend it with existing algorithms. For example: (1) an ontology was integrated with a naive Bayes classifier to enhance classification (Chang & Huang, 2008), and (2) a Gaussian function integrated into a CNN predicted bean yield better (Sabini, Rusak, & Ross, 2017). This second approach suffers a similar fate to the first: the innovations introduced provide no guarantee of performance improvement. A common opportunity in both approaches is the wide continuum of unexplored options, though this does not preclude researchers interested in fixing failed cases under either approach. This research pursues an unexplored option in the second approach.
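The second approach can be sketched as appending an "innovation" feature to a standard classifier's inputs. The abstract mentions similarity algorithms that imitate human cognition; since the preview does not describe them, the keyword-overlap similarity below is a hypothetical stand-in, and the seed vocabulary, corpus, and labels are all illustrative assumptions.

```python
# Hedged sketch: blending an innovation (a toy similarity score
# against a hypothetical seed vocabulary) with a standard
# TF-IDF + logistic regression classifier as an extra feature.
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical seed terms; the paper's actual similarity
# algorithm is not shown in this preview.
SEED_TERMS = {"famine", "drought", "hunger", "shortage", "crop", "harvest"}

def seed_similarity(text):
    """Toy similarity: fraction of seed terms appearing in the text."""
    tokens = set(text.lower().split())
    return len(tokens & SEED_TERMS) / len(SEED_TERMS)

# Placeholder corpus: 1 = article on food insecurity, 0 = unrelated.
texts = [
    "drought destroys maize harvest and famine looms in the region",
    "crop failure leaves thousands facing severe hunger",
    "food prices soar as shortages hit local markets",
    "football team wins the national championship final",
    "new smartphone model launched with improved camera",
    "stock market closes higher after central bank meeting",
]
labels = np.array([1, 1, 1, 0, 0, 0])

vec = TfidfVectorizer()

def featurize(docs, fit=False):
    """Stack standard TF-IDF features with the similarity feature."""
    tfidf = vec.fit_transform(docs) if fit else vec.transform(docs)
    sim = csr_matrix([[seed_similarity(d)] for d in docs])
    return hstack([tfidf, sim])

clf = LogisticRegression(max_iter=1000).fit(featurize(texts, fit=True), labels)
print(clf.predict(featurize(["famine and drought cause hunger"])))
```

The extra column gives the classifier a signal that generalizes beyond the training vocabulary, which is the kind of gain an innovation feature is hoped to provide; as the text notes, however, no such blend guarantees improvement.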
