Incorporating Qualitative Information for Credit Risk Assessment through Frequent Subtree Mining for XML

Novita Ikasari, Fedja Hadzic, Tharam S. Dillon

Source Title: Small and Medium Enterprises: Concepts, Methodologies, Tools, and Applications

DOI: 10.4018/978-1-4666-3886-0.ch025

Abstract

Credit risk assessment has been one of the most appealing topics in banking and finance studies, attracting both scholars’ and practitioners’ attention for some time. Following the success of the Grameen Bank, works on credit risk, in particular for Small and Medium Enterprises (SMEs), have become essential. The distinctive character of SMEs requires a method that takes into account both quantitative and qualitative information for loan granting decisions. In this chapter, we first provide a survey of existing credit risk assessment methods, which reveals a gap in the existing research with regard to taking qualitative information into account during the data mining process. To address this shortcoming, we propose a framework that utilizes an XML-based template to capture both qualitative and quantitative information in this domain. By representing this information in a domain-oriented way, the potential knowledge that can be discovered for evidence-based decision support will be maximized. An XML document can be effectively represented as a rooted ordered labelled tree, and a number of tree mining methods exist that enable the efficient discovery of associations among tree-structured data objects, taking both the content and structure into account. Guidelines for the correct and effective application of such methods are provided in order to gain detailed insight into the information governing the decision making process. We have obtained a number of textual reports from the banks regarding the information collected from SMEs during the credit application/evaluation process. These are used as the basis for generating a synthetic XML database that partially reflects real-world scenarios. A tree mining method is applied to this data to demonstrate the potential of the proposed method for credit risk assessment.

Introduction

The emerging need for methods of credit risk assessment for Small and Medium Enterprise (SME) loan applications presents a unique challenge to the knowledge discovery and data mining field. The present credit scoring methods are considered not viable for SMEs since they are constructed from characteristics and risks pertaining to large-scale businesses. In addition, SMEs are known for their imprecise management style, with non-systematic bookkeeping and organization of the business. This leads to a lack of valid and reliable financial information in traditional form (Berger, Klapper, & Udell, 2001; Berger & Udell, 1995), which is currently needed for the assessment of loan applications. To overcome this problem, loan staff are required to collate data using qualitative data collection methods, namely interviews and observations. Therefore, a good portion of the information on loan applications is available in qualitative rather than quantitative form.

The abundant studies on credit scoring have contributed to credit risk methods being constructed using statistical and machine learning techniques. Aside from these mainstream techniques, our survey of the existing literature shows that a small number of researchers have conducted studies using hybrid methods. Although each method shows respectable performance in classifying good and bad loan applications, each has inherent weaknesses. This is due, among other reasons, to the fact that they are constructed using quantitative data, which limits the applicability of such methods in the real world of SMEs. Recent studies on credit risk assessment of SMEs highlight the necessity of incorporating qualitative information into the method (e.g., Dinh & Kleimeier, 2007). The qualitative data on SME loan applications varies in both quality and quantity. Some elements that could affect decisions on loan applications are conspicuously more qualitative in nature than others; these are goodwill, competency and integrity. These three characteristics require adequate elaboration, since they can only be understood by inference rather than from a direct response.

This qualitative information is mainly available as free-form text, which poses additional complications, as most of the well-developed and well-explored statistical and machine learning (data mining) methods are applied mainly to relational data with a well-defined structure. The task at hand is to develop a technique that incorporates and analyses qualitative information in tandem with quantitative information so that it accurately discloses applicants’ credit risks. We propose a way to capture the qualitative information in a domain-oriented way by defining an XML-based template. We will show how the relevant information from the documents used by the banks for assessing credit risk for SME loan applications can be effectively captured using the proposed template. Within this context, we present preliminary results for the pre-defined XML template, generated from a small number of textual document instances.
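To make the idea of an XML-based template concrete, the following is a minimal Python sketch of how one loan application record might be parsed into a rooted ordered labelled tree. The element names and values (`LoanApplication`, `Character`, `Integrity`, and so on) are hypothetical and chosen only for illustration; they are not the template defined in the chapter.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element names and values are illustrative assumptions,
# not the schema proposed in the chapter.
SAMPLE = """
<LoanApplication>
  <Financial>
    <AnnualRevenue>medium</AnnualRevenue>
    <Collateral>land</Collateral>
  </Financial>
  <Character>
    <Integrity>good</Integrity>
    <Competency>experienced</Competency>
  </Character>
  <Decision>approved</Decision>
</LoanApplication>
"""


def to_labelled_tree(elem):
    """Convert an XML element into a (label, children) pair.

    Children keep their document order, so the result is a rooted ordered
    labelled tree. Leaf text is folded into the label as 'Tag=value' so that
    both structure and content are visible to a tree mining step.
    """
    children = [to_labelled_tree(child) for child in elem]
    text = (elem.text or "").strip()
    label = f"{elem.tag}={text}" if text and not children else elem.tag
    return (label, children)


def preorder_labels(tree):
    """List the node labels of a (label, children) tree in pre-order."""
    label, children = tree
    labels = [label]
    for child in children:
        labels.extend(preorder_labels(child))
    return labels


tree = to_labelled_tree(ET.fromstring(SAMPLE.strip()))
for label in preorder_labels(tree):
    print(label)
```

Folding leaf values into node labels is one simple design choice among several; it lets qualitative values such as `Integrity=good` sit in the same tree as quantitative ones without a separate content table.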

The main problem in association rule mining of semi-structured documents such as XML is that of frequent pattern discovery, where a pattern in this case corresponds to a subtree. This is known as the frequent subtree mining problem: given a tree database T_DB and a minimum support threshold (σ), the goal is to find all subtrees that occur at least σ times in T_DB. Driven by different application needs, several frequent subtree mining algorithms have been proposed in the literature that can mine different subtree types using different support definitions and constraints (Chi, Yang, & Muntz, 2005; Hadzic, Tan, & Dillon, 2010; Nijssen & Kok, 2003; Tan, Dillon, Hadzic, Feng, & Chang, 2006; Tan, Hadzic, Dillon, Feng, & Chang, 2008b; Zaki, 2005). We provide guidelines for the correct and effective application of frequent subtree mining methods and discuss the implications of using different parameters (i.e., subtree types and support definitions) in the credit risk assessment domain. The documents from the banks are used to generate the synthetic XML database to demonstrate the usefulness and potential of the proposed approach.
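The problem statement can be illustrated with a short Python sketch. It assumes one particular combination of parameters, namely induced ordered subtrees and transaction-based support (a subtree is counted once per tree in which it occurs), and it only checks a supplied list of candidate patterns rather than enumerating them. It is not any of the cited algorithms, and the tiny database and its labels are hypothetical.

```python
# Trees are (label, [children]) pairs with children in document order.

def matches_at(pattern, node):
    """True if `pattern` is embedded with its root mapped onto `node`.

    Induced, ordered matching: each pattern child must match a distinct
    child of `node`, and the left-to-right order must be preserved.
    """
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    remaining = iter(n_children)
    # Greedy scan: each pattern child consumes data children until one matches.
    return all(any(matches_at(pc, nc) for nc in remaining) for pc in p_children)


def occurs_in(pattern, tree):
    """True if `pattern` occurs rooted at any node of `tree`."""
    if matches_at(pattern, tree):
        return True
    return any(occurs_in(pattern, child) for child in tree[1])


def support(pattern, tree_db):
    """Transaction-based support: number of trees containing `pattern`."""
    return sum(1 for tree in tree_db if occurs_in(pattern, tree))


def frequent_subtrees(candidates, tree_db, sigma):
    """Keep the candidate patterns whose support is at least `sigma`."""
    return [p for p in candidates if support(p, tree_db) >= sigma]


# Hypothetical three-record database of loan applications.
T_DB = [
    ("App", [("Character", [("Integrity=good", []), ("Competency=high", [])]),
             ("Decision=approved", [])]),
    ("App", [("Character", [("Integrity=good", [])]),
             ("Decision=approved", [])]),
    ("App", [("Character", [("Integrity=poor", [])]),
             ("Decision=rejected", [])]),
]

good_approved = ("App", [("Character", [("Integrity=good", [])]),
                         ("Decision=approved", [])])
character_only = ("Character", [])
wrong_order = ("App", [("Decision=approved", []), ("Character", [])])

candidates = [good_approved, character_only, wrong_order]
for pattern in frequent_subtrees(candidates, T_DB, sigma=2):
    print(support(pattern, T_DB), pattern)
```

Changing the subtree type (for example, allowing ancestor-descendant rather than parent-child edges) or the support definition (counting every occurrence rather than every containing tree) would change which patterns pass the threshold, which is why these parameters matter in practice.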
