Strategy Selection and Outcome Evaluation of Change-Based Three-Way Decisions Based on Reinforcement Learning

Copyright: © 2024 | Pages: 22
DOI: 10.4018/979-8-3693-1582-8.ch006

Abstract

In this chapter, we enhance the trisecting-acting-outcome (TAO) model of three-way decision-making (3WD) with a novel approach for strategy selection and outcome prediction using Q-learning in reinforcement learning. We reinterpret the changes in tripartition and actions in the TAO model as states and actions in reinforcement learning, respectively. The reward is quantified using cumulative prospect theory, and the Q-learning algorithm iteratively determines action sets that achieve target rewards efficiently. This method offers a cost-effective and psychologically attuned action set for predicting the utility in change-based 3WD, demonstrated through a practical example.
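The loop described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the three-state toy environment, the function names, and the learning parameters are hypothetical, and only the value-function part of cumulative prospect theory is shown (probability weighting is omitted). The constants 0.88, 0.88, and 2.25 are the commonly cited Tversky and Kahneman estimates, assumed here for illustration.

```python
import random

def cpt_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect-theory value function: concave for gains, steeper for losses.
    Parameter values are assumed, not taken from the chapter."""
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)

def toy_step(state, action):
    """Hypothetical environment: states 0 and 1 stand for two tripartitions,
    state 2 is the target. Action 1 moves forward, action 0 stays put."""
    if action == 1:
        nxt = state + 1
        done = nxt == 2
        return nxt, (5.0 if done else -1.0), done
    return state, -0.5, False

def q_learning(n_states, n_actions, step, episodes=500, lr=0.1,
               gamma=0.9, eps=0.1, max_steps=50, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration.
    Raw outcomes are passed through cpt_value before the update."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])
            s2, outcome, done = step(s, a)
            r = cpt_value(outcome)
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += lr * (target - q[s][a])
            s = s2
            if done:
                break
    return q

q = q_learning(n_states=3, n_actions=2, step=toy_step)
# Greedy action per non-terminal state, read off the learned table.
policy = [max(range(2), key=lambda i: q[s][i]) for s in range(2)]
```

In this toy setting the greedy policy prefers the forward action in both non-terminal states, because the subjectively weighted cost of stalling outweighs the one-off cost of moving.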

Introduction

The three-way decision (3WD) model is consistent with human cognition and offers a nuanced semantic framework for understanding decision-making processes. It sorts decisions into three categories: acceptance, non-commitment, and rejection, derived from the positive, boundary, and negative regions of rough set approximations, respectively (Yao, 2009). Yao later refined this concept, proposing a model that divides a universal set into three segments to facilitate more precise classification (Yao, 2012).
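As a rough illustration of the three regions, the sketch below assigns each object to acceptance, non-commitment, or rejection by comparing an evaluation score against a pair of thresholds. The threshold names `alpha` and `beta`, their values, and the sample scores are assumptions for illustration; the preview itself does not state how the regions are computed.

```python
def trisect(universe, evaluate, alpha=0.7, beta=0.3):
    """Split `universe` into (positive, boundary, negative) regions.
    alpha and beta are hypothetical thresholds with beta < alpha."""
    if not beta < alpha:
        raise ValueError("beta must be smaller than alpha")
    positive, boundary, negative = [], [], []
    for x in universe:
        score = evaluate(x)
        if score >= alpha:
            positive.append(x)      # accept
        elif score <= beta:
            negative.append(x)      # reject
        else:
            boundary.append(x)      # defer (non-commitment)
    return positive, boundary, negative

# Made-up scores, e.g. estimated probabilities of belonging to a target concept.
scores = {"a": 0.9, "b": 0.5, "c": 0.1, "d": 0.7}
pos, bnd, neg = trisect(scores, scores.get)
```

With these made-up scores, `a` and `d` are accepted, `c` is rejected, and `b` is left in the boundary region for further evidence.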

Over time, the scope of 3WD has expanded beyond its original probabilistic rough set basis, evolving into a broader conceptual framework and a complex research network (Yang & Li, 2019). This more expansive interpretation of 3WD emphasizes a triadic approach to thought and problem solving, a method prevalent across various disciplines and supported by cognitive theory. This approach simplifies complex problems by dividing them into three interconnected but distinct components, reducing cognitive and information overload (Yao, 2019). This development has produced a large body of research (Fang & Min, 2019; Liu & Liang, 2016; Li & Huang, 2017; Min & Zhu, 2012; Qian & Liu, 2020; Wang & Yao, 2018; Zhang & Li, 2020; Zhang & Pang, 2020; Yu & Wang, 2020; Yang & Li, 2019; Zhan & Wang, 2023; Wang & Li, 2022; Wang & Ma, 2022).

A notable example of this expansive interpretation is the trisecting-acting-outcome (TAO) model. Here, ‘trisecting’ refers to dividing the universal set into three parts, and ‘acting’ refers to selecting and applying the actions that achieve the required outcome and keep it at a moderate level. On the one hand, the decision maker designs strategies for the tripartition, based on the required outcome, to achieve the expected results. On the other hand, strategies are selected to maintain the outcome at a moderate level. ‘Outcome’ refers to an evaluation of how effective the trisecting and acting were.
