Review of Probability Elicitation and Examination of Approaches for Large Bayesian Networks

DOI: 10.4018/978-1-6684-7766-3.ch009

Abstract

Probability elicitation is the process of formulating a person's knowledge and beliefs about one or more uncertain quantities into a joint probability distribution for those quantities. The important point is that the goal of elicitation is to capture a person's knowledge and beliefs based upon their current state of information. Consequently, the results of elicitation need only be good enough to support reasoned decisions or reasonable inferences. This chapter identifies how to elicit probabilities for large conditional probability tables in Bayesian networks. Bayesian networks are statistical models that describe and visualize, in a compact graphical form, the probabilistic relationships between variables of interest: the nodes of the graph correspond to the variables, while the directed edges between the nodes encode conditional independence relationships between them.

Introduction

A Bayesian network (BN) is an acyclic directed graph in which the vertices or nodes represent random variables and the directed arcs indicate probabilistic dependence (Kuipers et al., 2022). Each random variable is defined by a set of mutually exclusive and exhaustive states. The arcs in the network define probabilistic dependence between pairs of variables; the direction of an arc indicates which conditional probability distribution has been captured. The definition of a BN prohibits cycles in the network; a cycle is a sequence of arcs that starts at one node and leads back to the same node. Each variable then contains a probability distribution conditioned on the nodes having arcs into the node in question (called the parents of the node). For variables with discrete states, the probabilistic dependence of a random variable on its parents is captured in a conditional probability table (CPT). For each unique combination of the states of the parent variables, a distribution over the states of the dependent variable is specified.

As displayed in Figure 1, four variables (or nodes) model the weather in order to predict the chance of light, moderate, or heavy rain during a farmer’s planting season. The boxes show the probability distributions contained within each node. Note how the probability table expands as a node acquires more parents (entering arcs).
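A CPT of the kind shown in Figure 1 can be sketched in code as a mapping from parent-state combinations to distributions over the child's states. In this sketch only the Rain states (light, moderate, heavy) come from the text; the parent variables (Season, CloudCover) and all numbers are illustrative assumptions, not the chapter's actual model.

```python
# Illustrative CPT for a Rain node with two hypothetical parents.
# Note how the table grows with the parents: 2 Season states x
# 2 CloudCover states = 4 rows, each a distribution over Rain.
rain_cpt = {
    ("wet_season", "overcast"): {"light": 0.2, "moderate": 0.4, "heavy": 0.4},
    ("wet_season", "clear"):    {"light": 0.6, "moderate": 0.3, "heavy": 0.1},
    ("dry_season", "overcast"): {"light": 0.5, "moderate": 0.3, "heavy": 0.2},
    ("dry_season", "clear"):    {"light": 0.8, "moderate": 0.15, "heavy": 0.05},
}

# Each row must be a proper distribution: non-negative and summing to 1.
for parent_states, dist in rain_cpt.items():
    assert all(p >= 0 for p in dist.values()), parent_states
    assert abs(sum(dist.values()) - 1.0) < 1e-9, parent_states
```

Adding a third binary parent would double the table to eight rows, which is the growth the figure illustrates.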

Figure 1.

Sample Bayesian Network Showing Conditional Probability Tables

BNs are rooted in statistics, computer science, and artificial intelligence, and are structured using conditional probability and Bayes’ theorem, which capture dependency among system components (Hosseini & Ivanov, 2020). Traditionally, Bayesian network structure learning is carried out at a central site where all data is gathered. In practice, however, data may be distributed across different parties (e.g., companies, devices) who intend to collectively learn a Bayesian network but are unwilling to disclose information related to their data owing to privacy or security concerns (Ng & Zhang, 2022). The benefits of BNs are: (1) high flexibility to model any causal relationships; (2) the capability to integrate information from any kind of source, including experimental data, historical data, and prior expert opinion; and (3) the capability to answer probabilistic queries and to compute updated beliefs about the state of a subset of variables when other variables (the evidence variables) are observed (Rohmer, 2020).
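The third benefit, updating beliefs when evidence variables are observed, reduces to Bayes' theorem in the two-node case. The sketch below assumes a made-up prior over Rain and a made-up likelihood of observing wet ground given each rain state; none of these numbers come from the chapter.

```python
# Hypothetical prior P(Rain) and likelihood P(WetGround=yes | Rain).
prior = {"light": 0.6, "moderate": 0.3, "heavy": 0.1}
likelihood = {"light": 0.3, "moderate": 0.7, "heavy": 0.95}

# Bayes' theorem: P(Rain | WetGround=yes) is proportional to
# P(WetGround=yes | Rain) * P(Rain), then normalized to sum to 1.
unnorm = {s: prior[s] * likelihood[s] for s in prior}
z = sum(unnorm.values())
posterior = {s: p / z for s, p in unnorm.items()}

# Observing wet ground shifts belief toward heavier rain.
assert posterior["heavy"] > prior["heavy"]
```

In a full network the same update is propagated through every CPT by an inference algorithm, but the single-arc case above is the core operation.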

For a BN, we may consider a network large if it contains variables with a large number of states, CPTs with a large number of distributions, and/or a large number of variables. Acquiring numbers is only a part of the overall network engineering process. Experts are required to identify relevant variables, define their states, and specify the dependencies among the variables. These are commonly referred to as the structure of the network, while the probability distributions are called the parameters of the network. We define the probability elicitation size of a network to be the number of probabilities that need to be elicited from experts.
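One way to make the elicitation-size definition concrete: each node needs one distribution per combination of parent states, and a distribution over k states requires k - 1 free probabilities (the last is determined by the others summing to 1). The four-node network below is a hypothetical example used only to exercise the count.

```python
from math import prod

# Hypothetical network: number of states per node and parent sets.
num_states = {"A": 2, "B": 3, "C": 3, "D": 4}
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

def elicitation_size(num_states, parents):
    """Count the probabilities an expert must supply for the whole network."""
    total = 0
    for node, ps in parents.items():
        rows = prod(num_states[p] for p in ps)   # 1 if the node has no parents
        total += rows * (num_states[node] - 1)   # free parameters per row
    return total

# A: 1*1, B: 2*2, C: 2*2, D: (3*3)*3 = 27  ->  36 probabilities in total
print(elicitation_size(num_states, parents))  # 36
```

The dominant term comes from node D, whose nine parent-state combinations each need three free probabilities; this is why a few extra parents can make a table dwarf the rest of the network.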

Large CPTs present elicitors not only with the challenge of eliciting a large number of distributions that need to be consistent with one another, but also with the difficulty of simply navigating the table to compare closely related distributions. Large CPTs also present problems for experts, who face the mental task of defining how multiple sets of states for several variables impact the conditional probabilities of the states of the target variable being assessed. See Table 1, which is taken from a notional model about whether Iran is creating a nuclear weapon.
