This paper describes a new class of neuro-fuzzy models, called Reinforcement Learning Hierarchical Neuro-Fuzzy Systems (RL-HNF). These models employ BSP (Binary Space Partitioning) and Politree partitioning of the input space [Chrysanthou,1992] and were developed to overcome two traditional drawbacks of neuro-fuzzy systems: the limited number of allowed inputs and the poor capacity to create their own structure and rules (as in ANFIS [Jang,1997], NEFCLASS [Kruse,1995] and FSOM [Vuorimaa,1994]). The new models, named Reinforcement Learning Hierarchical Neuro-Fuzzy BSP (RL-HNFB) and Reinforcement Learning Hierarchical Neuro-Fuzzy Politree (RL-HNFP), descend from the original HNFB model, which uses Binary Space Partitioning (see Hierarchical Neuro-Fuzzy Systems Part I). By combining hierarchical partitioning with the Reinforcement Learning (RL) methodology, a new class of neuro-fuzzy systems was obtained that, in addition to automatically learning its structure, autonomously learns the actions to be taken by an agent, dispensing with a priori information about the learning process (the number of rules, and the fuzzy rules and sets). These characteristics represent an important advantage over existing learning systems for intelligent agents, because in applications involving continuous and/or high-dimensional environments the use of traditional Reinforcement Learning methods based on lookup tables (a table that stores the value function for a small or discrete state space) is no longer feasible, since the state space becomes too large. This second part on hierarchical neuro-fuzzy systems focuses on the reinforcement learning process; the first part presented HNFB models based on supervised learning methods. The RL-HNFB and RL-HNFP models were evaluated in a benchmark control application and in a simulated Khepera robot environment with multiple obstacles.
Hierarchical Neuro-Fuzzy Systems
This section presents the new class of neuro-fuzzy systems based on hierarchical partitioning. As mentioned in the first part, two sub-sets of hierarchical neuro-fuzzy systems have been developed, according to the learning process used: supervised learning models (HNFB [Souza,2002][Vellasco,2004], HNFB-1 [Gonçalves,2006] and HNFB-Mamdani [Bezerra,2005]) and reinforcement learning models (RL-HNFB [Figueiredo,2005a] and RL-HNFP [Figueiredo,2005b]). The focus of this article is on the second sub-set of models, which are described in the following sections.
Key Terms in this Chapter
Reinforcement Learning: A sub-area of machine learning concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward. Reinforcement learning algorithms attempt to find a policy that maps states of the world to the actions the agent ought to take in those states. Differently from supervised learning, in this case there is no target value for each input pattern, only a reward indicating how good or bad the action taken by the agent in its environment was.
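The lookup-table representation mentioned above can be made concrete with a minimal tabular Q-learning sketch (the function names, the toy corridor environment and the parameter values are illustrative assumptions, not from the text). The table stores one value per state-action pair, which is exactly the representation that stops scaling in large or continuous state spaces:

```python
import random

def q_learning(n_states, n_actions, episodes, step, alpha=0.1, gamma=0.9, eps=0.1):
    """`step(state, action) -> (next_state, reward, done)` simulates the environment."""
    # Lookup table: one value per (state, action) pair.
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy: explore with probability eps, otherwise exploit.
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])
            s2, r, done = step(s, a)
            # Temporal-difference update toward reward plus discounted best future value.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

# Toy corridor: states 0..4, action 1 moves right, action 0 moves left;
# reaching state 4 yields reward 1 and ends the episode.
def corridor(s, a):
    s2 = min(4, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == 4 else 0.0), s2 == 4

q = q_learning(5, 2, 300, corridor, eps=0.3)
```

After training, the table prefers moving right toward the goal; with a large or continuous state space, however, this table could not even be allocated, which motivates the hierarchical partitioning used by RL-HNFB and RL-HNFP.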
Politree Partitioning: The Politree partitioning was inspired by the quadtree structure, which has been widely used in the area of image manipulation and compression. In the politree partitioning, the subdivision of the n-dimensional space is accomplished by m = 2^n subdivisions. The Politree partitioning can be represented by a tree structure in which each node is subdivided into m leaves.
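One politree subdivision step can be sketched as follows (the helper name and box representation are assumptions): splitting an n-dimensional box at its midpoint along every axis yields m = 2^n child cells, each represented by its lower and upper corner.

```python
from itertools import product

def politree_split(lower, upper):
    """Split the box [lower, upper] at its midpoint along every axis."""
    n = len(lower)
    mid = [(lo + hi) / 2.0 for lo, hi in zip(lower, upper)]
    children = []
    # Each child is selected by choosing, per axis, the lower or upper half.
    for choice in product((0, 1), repeat=n):
        c_lo = [lower[i] if b == 0 else mid[i] for i, b in enumerate(choice)]
        c_hi = [mid[i] if b == 0 else upper[i] for i, b in enumerate(choice)]
        children.append((c_lo, c_hi))
    return children

cells = politree_split([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])  # n = 3 -> 2**3 = 8 cells
```

With n = 2 this reduces to the quadtree partitioning that inspired the politree structure.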
Binary Space Partitioning: In this type of partitioning, the space is successively divided into two regions, in a recursive way. This partitioning can be represented by a binary tree that illustrates the successive subdivisions of the n-dimensional space into two convex subspaces. The construction of this partitioning tree (BSP tree) is a process in which a subspace is divided by a hyperplane parallel to the coordinate axes. This process results in two new subspaces that can later be partitioned by the same method.
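The recursive construction described above can be sketched as a small BSP tree with axis-parallel splits (the class name, the midpoint split and the axis-cycling rule are illustrative assumptions): each node stores its subspace bounds and, down to a fixed depth, two children produced by one hyperplane cut.

```python
class BSPNode:
    """Axis-parallel BSP tree node over the box [lower, upper]."""

    def __init__(self, lower, upper, depth=0, max_depth=2):
        self.lower, self.upper = lower, upper
        self.left = self.right = None
        if depth < max_depth:
            axis = depth % len(lower)                # cycle the splitting axis
            cut = (lower[axis] + upper[axis]) / 2.0  # hyperplane position (midpoint)
            lo_hi = list(upper); lo_hi[axis] = cut   # upper corner of the lower half
            hi_lo = list(lower); hi_lo[axis] = cut   # lower corner of the upper half
            self.left = BSPNode(lower, lo_hi, depth + 1, max_depth)
            self.right = BSPNode(hi_lo, upper, depth + 1, max_depth)

    def leaves(self):
        """Collect the final convex subspaces as (lower, upper) pairs."""
        if self.left is None:
            return [(self.lower, self.upper)]
        return self.left.leaves() + self.right.leaves()

tree = BSPNode([0.0, 0.0], [1.0, 1.0], max_depth=2)  # two binary splits -> 4 leaf cells
```

Each level adds one cut, so depth d produces 2^d convex leaf subspaces; the hierarchical neuro-fuzzy models attach fuzzy sets and rules to such recursively generated cells.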
Fuzzy Inference Systems: Fuzzy inference is the process of mapping from a given input to an output using fuzzy logic. The mapping then provides a basis from which decisions can be made, or patterns discerned. Fuzzy inference systems have been successfully applied in fields such as automatic control, data classification and decision analysis.
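The input-to-output mapping can be illustrated with a minimal zero-order Sugeno-style inference step (the membership functions, rule consequents and variable names are assumptions chosen for the example): each rule's firing strength weights its crisp consequent, and the output is the weighted average.

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fan_speed(temp):
    """Two-rule fuzzy inference: map a temperature to a fan-speed command."""
    # Rule 1: IF temp is cold THEN fan = 0.2
    # Rule 2: IF temp is hot  THEN fan = 0.9
    w_cold = tri(temp, 0.0, 10.0, 25.0)
    w_hot = tri(temp, 15.0, 30.0, 40.0)
    if w_cold + w_hot == 0.0:
        return 0.0
    # Defuzzification: consequents weighted by rule firing strengths.
    return (w_cold * 0.2 + w_hot * 0.9) / (w_cold + w_hot)
```

At temperatures where both rules fire, the output interpolates smoothly between the two consequents, which is what makes fuzzy inference attractive for control.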
Machine Learning: Concerned with the design and development of algorithms and techniques that allow computers to “learn”. The major focus of machine learning research is to automatically extract useful information from historical data, by computational and statistical methods.
Quadtree Partitioning: In this type of partitioning, the space is successively divided into four regions, in a recursive way. This partitioning can be represented by a quaternary tree that illustrates the successive subdivisions of the space into four convex subspaces. The construction of this partitioning tree (quadtree) is a process in which a subspace is divided by two hyperplanes parallel to the coordinate axes. This process results in four new subspaces that can later be partitioned by the same method. The limitation of the quadtree partitioning (fixed or adaptive) lies in the fact that it works only in two-dimensional spaces.
Sarsa: An on-policy variation of the Q-learning (Reinforcement Learning) algorithm, based on model-free estimation of the action policy. SARSA assumes that actions are chosen randomly with a predefined probability (as in epsilon-greedy exploration), and it updates its value estimates using the action actually taken by that policy.
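The difference from Q-learning can be shown in the update rule itself (function and parameter names are assumptions): where Q-learning bootstraps from max_a Q(s', a), SARSA uses the value of the action a' actually chosen by the epsilon-greedy policy.

```python
import random

def epsilon_greedy(q, s, n_actions, eps):
    """Choose a random action with probability eps, otherwise the greedy one."""
    if random.random() < eps:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[s][a])

def sarsa_update(q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    """One SARSA step for the transition (s, a, r, s2) with next action a2."""
    # On-policy TD target: uses Q(s', a') for the action the policy actually takes,
    # instead of Q-learning's max over actions in s'.
    q[s][a] += alpha * (r + gamma * q[s2][a2] - q[s][a])
    return q
```

Because the target follows the exploring policy itself, SARSA evaluates the policy it is executing, whereas Q-learning evaluates the greedy policy regardless of how actions were selected.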
WoLF: (“Win or Learn Fast”) is a method by [Bowling,2002] for changing the learning rate so as to encourage convergence in a multi-agent reinforcement learning scenario.
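The WoLF principle can be sketched in a few lines (the function name and rate values are assumptions): the policy-update rate is kept small while the agent is "winning" (its current policy outperforms its long-run average policy) and made larger while it is "losing", so it learns cautiously when ahead and fast when behind.

```python
def wolf_rate(value_current, value_average, delta_win=0.01, delta_lose=0.04):
    """Return the policy learning rate: cautious when winning, fast when losing."""
    # "Winning": the current policy's expected value beats the average policy's.
    return delta_win if value_current > value_average else delta_lose
```

The two rates satisfy delta_win < delta_lose; this asymmetry is what Bowling showed encourages convergence in multi-agent settings.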