Dynamic LIMIDS

Dynamic LIMIDS

Francisco J. Díez, Marcel A. J. van Gerven
DOI: 10.4018/978-1-60960-165-2.ch008
OnDemand:
(Individual Chapters)
Available
$37.50
No Current Special Offers
TOTAL SAVINGS: $37.50

Abstract

One of the objectives of artificial intelligence is to build decision-support models for systems that evolve over time and include several types of uncertainty. Dynamic limited-memory influence diagrams (DLIMIDs) are a new type of model proposed recently for this kind of problems. DLIMIDs are similar to other models in assuming a multi-stage process that satisfies the Markov property, i.e., that the future is independent of the past given the current state. The main difference with those models is the restriction of limited memory, which means that the decision maker must make a choice based only on recent observations, but this limitation can be circumvented by the use of memory variables. We present several algorithms for evaluating DLIMIDs, show a real-world model for a medical problem, and compare DLIMIDs with related formalisms, such as LIMIDs, dynamic influence diagrams, and POMDPs.
Chapter Preview
Top

Introduction

One of the main goals of artificial intelligence is to create computer programs that make optimal decisions in situations characterized by uncertainty. A particular family of problems of this type is to make decisions about a system that evolves over time. For instance, a patient suffering from a chronic disease goes to the doctor several times a year, and in each visit the doctor observes some symptoms and signs, orders some tests, and selects a treatment for the patient. In the same way, a robot must decide which action to perform at each moment, which may include analyzing an image, moving to a destination, picking up an object, recharging its batteries, or any combination of them. A similar problem would be the control of an industrial plant, in which the decisions might be to perform a test, to open or close a valve, to replace a component, etc.

One of the first modeling tools for decision analysis and decision support were probabilistic decision trees (Raiffa, 1968). Even though they are still the standard method in some fields, such as medicine and economy, they are very difficult to build, even for medium-size problems. For this reason, they are not considered as a suitable tool for artificial intelligence. Influence diagrams, initially proposed in the field of economy (Howard & Matheson, 1984), are much more powerful as a modeling tool, and for this reason they have become very popular in artificial intelligence. However, they are not suitable for solving temporal problems, such as the three examples presented above. One of the reasons is that influence diagrams require a total ordering of decisions, which is an unrealistic assumption in many situations in which the decision maker does not know in advance what decision should be made first to maximize the expected utility. A second drawback, especially important in the case of temporal reasoning, is that in an influence diagram each policy depends—at least in principle—on all the past observations. For a system that evolves over a large period of time, the number of observations usually grows linearly with the passing of time, which implies that the size of policies grows superexponentially.

In order to avoid these two limitations, Lauritzen and Nilsson (2001) presented limited-memory influence diagrams (LIMIDs) as an extension of influence diagrams. The term limited-memory name reflects the property that a finding known when making a decision is not necessarily “remembered” when making a posterior decision. Eliminating some variables from the domain of some policies may reduce the complexity of the model to the point of making it solvable with a computer, albeit at the price of reducing the quality of the policies.

In their original formulation, LIMIDs did not have an explicit representation of time, nor the possibility of indicating that the structure of the model repeats itself periodically. As a consequence, the size of a LIMID aimed at modeling a system that evolves over a long period of time would grow linearly with the duration of the process. In order to overcome this limitation, we developed DLIMIDs (dynamic LIMIDs) as a type of temporal model for decision analysis and decision support in problems with uncertainty and a periodic structure (van Gerven & Díez, 2006, van Gerven, Díez, Taal, & Lucas, 2007).

Independently, Markov decision processes (MDPs) were developed around the mid-20th century as a model for finding the optimal policy in situations where outcomes are partly random and partly under the control of a decision maker (Bellman, 1957). The main limitation of this model was the assumption that the state of the system is always known with certainty, which is unrealistic in most cases. The relaxation of this assumption led to the emergence of partially observable Markov decision processes (POMDPs) (Åström, 1965), in which the state of the system is not directly observable, but there is a variable that correlates probabilistically with it. The main problem of MDPs and POMDPs is that the variable that represents the state of the system may take a huge number of values, which complicates significantly the evaluation of the models. A partial remedy to this problem was to use several variables to represent the state of the system, rather than only one; this led to factored MDPs (Boutilier, Dearden, & Goldszmidt, 1995, 2000) and factored POMDPs (Boutilier & Poole, 1996).

Complete Chapter List

Search this Book:
Reset