Decision Support Tool for the Agri-Food Sector Using Data Annotated by Ontology and Bayesian Network: A Proof of Concept Applied to Milk Microfiltration

Cédric Baudrit, Patrice Buche, Nadine Leconte, Christophe Fernandez, Maëllis Belna, Geneviève Gésan-Guiziou
DOI: 10.4018/IJAEIS.309136

Abstract

The scientific literature is a valuable source of information for developing predictive models to design decision support systems. However, scientific data are heterogeneously structured and expressed using different vocabularies. This study developed a generic workflow that combines ontologies, databases and computational tools based on the theory of belief functions and Bayesian networks (BNs). The ontology paradigm is used to help integrate data from heterogeneous sources. A BN is then estimated from the integrated data, taking their reliability into account. The proposed method is original in that it provides an annotation and reasoning tool dedicated to systematic analysis of the literature, which takes expert knowledge of the domain into account at several levels: ontology definition, reliability criteria and dependence relations between variables in the BN. The workflow was successfully assessed by applying it to a complex food engineering process, skimmed milk microfiltration, and represents an original contribution to the state of the art in this application domain.

1. Introduction

For decision tasks such as optimising food processes, an initial step is to predict variables of interest from process parameters. The scientific literature, including experimental data and knowledge expressed by domain experts, is a valuable source of information to reach this goal. However, the ever-increasing amount of scientific data is heterogeneously structured, found mainly in text format and expressed using different vocabularies. Addressing this difficulty requires innovative tools that can integrate and process new information. In this context, Semantic Web methods based on ontologies seem relevant for structuring experimental information (Lousteau-Cazalet et al. 2016; Yeumo et al. 2017; Aubin et al. 2019). As experiments use different methods and technologies, another difficulty is accounting for the reliability of each source (document) when using its data in calculations. The theory of belief functions provides suitable solutions to address this issue (Destercke et al. 2013). Providing relevant conclusions and recommendations requires modelling tools that can integrate as much of the available knowledge as possible, even though it is heterogeneous in nature and quality. Such tools must be able to manage heterogeneous sources of knowledge (experimental data and expert opinion), multiple scales and different forms of uncertainty (Perrot et al. 2016; Barnabe et al. 2018). With this goal in mind, Bayesian networks (BNs) (Jensen and Nielsen, 2007; Pearl, 1988) provide a practical mathematical structure for describing complex systems that contain uncertainty. BNs couple graph theory and probability theory: the graph provides an intuitively appealing interface with which model designers can represent strongly interacting sets of variables, and uncertainty in the system is captured by quantifying the dependence between variables in the form of conditional probabilities. The use of BNs has been investigated recently in agri-food domains (Baudrit et al. 2015; Drury et al. 2017; Chapman et al. 2018).
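
To make the coupling between graph and conditional probabilities concrete, the following minimal sketch encodes a three-node chain and answers a diagnostic query by enumeration. The variable names (`pressure`, `fouling`, `flux`), their two states and every probability value are assumptions invented for illustration; they are not taken from the study, and the model described in the article may be structured quite differently.

```python
# Hypothetical chain: pressure -> fouling -> flux.
# All numbers below are made-up illustration values, not data from the study.
STATES = ("low", "high")

P_PRESSURE = {"low": 0.6, "high": 0.4}           # P(pressure)
P_FOULING = {                                     # P(fouling | pressure)
    "low": {"low": 0.8, "high": 0.2},
    "high": {"low": 0.3, "high": 0.7},
}
P_FLUX = {                                        # P(flux | fouling)
    "low": {"low": 0.1, "high": 0.9},
    "high": {"low": 0.75, "high": 0.25},
}


def joint(pressure, fouling, flux):
    """Joint probability as the product of each node's conditional table."""
    return (
        P_PRESSURE[pressure]
        * P_FOULING[pressure][fouling]
        * P_FLUX[fouling][flux]
    )


def posterior_pressure(observed_flux):
    """P(pressure | flux) by summing out the unobserved fouling variable."""
    unnormalised = {
        p: sum(joint(p, f, observed_flux) for f in STATES) for p in STATES
    }
    total = sum(unnormalised.values())
    return {p: value / total for p, value in unnormalised.items()}


if __name__ == "__main__":
    print(posterior_pressure("low"))
```

Enumeration is only practical for very small networks; it is used here because it shows directly how the graph structure determines which conditional tables are multiplied together.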

This article presents a numerical workflow for processing data and knowledge that combines ontologies, databases and computational tools based on the theory of belief functions and BNs. The workflow is the result of a multidisciplinary collective study involving experts in food processing and artificial intelligence, and comprises three sequential steps (see Fig. 1). The first step consists of eliciting, structuring and assessing knowledge related to a food process of interest. More precisely, experimental data published in scientific articles are annotated using an ontology, and their reliability is assessed by experts in food processing. Data from scientific articles are annotated in a simple tabular file that is semi-automatically generated using the ontology (see step 1.1 in Fig. 1). Then, in step 1.2, the file is uploaded and the annotated data are stored in a Resource Description Framework (RDF) database. The complete annotation data set used in this paper is available from Buche et al. (2021), and the database can be queried in open access through a SPARQL Protocol and RDF Query Language (SPARQL) endpoint. Relationships between variables, drawn from expert opinion, are structured by a BN through its associated graph. The second step consists of extracting annotated data and associated reliability scores from the database, using a dedicated querying system guided by the ontology, in order to learn the BN parameters (i.e. conditional probability tables). The third step consists of reasoning via inference with the resulting model in order to predict process parameters, which is an initial step in optimising the food process and thus in designing a decision support system. The workflow is iterative: it can be enriched with new data and knowledge without damaging its overall structure.
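
As an illustration of the second step, the sketch below shows one simple way reliability scores could enter parameter learning: each annotated record contributes its reliability score, rather than a full count of 1, to the conditional probability table. This is an assumption made for illustration only and is not necessarily the estimation procedure used by the authors; the record fields, the `weighted_cpt` helper, the smoothing constant `prior` and the example values are all hypothetical.

```python
from collections import defaultdict


def weighted_cpt(records, parent, child, child_states, prior=1.0):
    """Estimate P(child | parent) from records weighted by reliability.

    Each record adds its 'reliability' score (assumed to lie in [0, 1])
    to the matching cell. 'prior' is a pseudo-count added to every cell
    so that no probability is exactly zero.
    """
    counts = defaultdict(lambda: {state: prior for state in child_states})
    for record in records:
        counts[record[parent]][record[child]] += record["reliability"]

    cpt = {}
    for parent_state, row in counts.items():
        total = sum(row.values())
        cpt[parent_state] = {state: w / total for state, w in row.items()}
    return cpt


# Hypothetical annotated observations, as they might look after extraction
# from the database; values are invented for illustration.
records = [
    {"pressure": "low", "fouling": "low", "reliability": 0.9},
    {"pressure": "low", "fouling": "low", "reliability": 0.8},
    {"pressure": "low", "fouling": "high", "reliability": 0.3},
    {"pressure": "high", "fouling": "high", "reliability": 1.0},
    {"pressure": "high", "fouling": "low", "reliability": 0.5},
]

cpt = weighted_cpt(records, "pressure", "fouling", ("low", "high"))

if __name__ == "__main__":
    for parent_state, row in cpt.items():
        print(parent_state, row)
```

With this weighting, a record from a poorly rated source shifts the estimated probabilities less than a record from a highly rated one, which is the qualitative behaviour the workflow aims for.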

Figure 1. Workflow process developed in this study. RDF: Resource Description Framework, DB: database, BN: Bayesian network.
