Text Classification and Topic Modeling for Online Discussion Forums: An Empirical Study From the Systems Modeling Community

Xin Zhao, Zhe Jiang, Jeff Gray

Source Title: Trends and Applications of Text Summarization Techniques

DOI: 10.4018/978-1-5225-9373-7.ch006

Abstract

Online discussion forums play an important role in building and sharing domain knowledge. An extensive amount of information can be found in online forums, covering every aspect of life and professional discourse. This chapter introduces the application of supervised and unsupervised machine learning techniques to the analysis of forum questions. It starts with supervised machine learning techniques that classify forum posts into pre-defined topic categories. As a supporting technique, web scraping is also discussed as a way to gather data from an online forum. The chapter then introduces unsupervised learning techniques that identify latent topics in documents. The combination of supervised and unsupervised machine learning approaches offers deeper insights into the data obtained from online forums. The chapter demonstrates these techniques through a case study of the LabVIEW forum, a very large online discussion forum from the systems modeling community. Finally, the authors list future trends in applying machine learning to understand the expertise captured in online expert communities.

Chapter Preview

1. Introduction and Background

Systems modeling is the process of developing abstract models that represent multiple perspectives (e.g., structural, behavioral) of a system. Such models also provide a popular way to explore, update, and communicate system aspects to stakeholders, while significantly reducing or eliminating dependence on traditional text documents. There are several popular systems modeling tools, such as Simulink (MathWorks, 2019) and LabVIEW (National Instruments, 2019).

Laboratory Virtual Instrument Engineering Workbench (LabVIEW) is a system-design platform and development environment for a visual programming language from National Instruments. LabVIEW offers a graphical programming approach that helps users visualize every aspect of the system, including hardware configuration, measurement data, and debugging. The visualization makes it simple to integrate measurement hardware from any vendor, represent complex logic on the diagram, develop data analysis algorithms, and design custom engineering user interfaces. LabVIEW is widely used in both academia (Ertugrul, 2000, 2002) and industry, at companies such as Subaru Motor (Morita, 2018) and Bell Helicopter (Blake, 2015). There are more than 35,000 LabVIEW customers worldwide (Falcon, 2017).

Text summarization refers to the technique of extracting information from a large corpus of data, and it is a common application area of machine learning and natural language processing. With the increasing production and consumption of data in all aspects of our lives, text summarization helps to reduce the time needed to digest and analyze information by extracting the most valuable and pertinent information from a very large dataset.

There are two main types of text summarization: extractive and abstractive. Extractive text summarization pulls keywords or key phrases from a source document to convey the key points of the original document. Abstractive text summarization creates a new document that summarizes the original. The result of abstractive text summarization may include new words or phrases that do not appear in the original document.
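To make the extractive idea concrete, the following is a minimal sketch using only the Python standard library. It is not the method used in this chapter: it assumes a simple word-frequency scoring rule, a tiny hypothetical stop-word list, and an invented forum post. Each sentence is scored by the average corpus frequency of its non-stop words, and the top-scoring sentences are returned verbatim, which is what makes the summary extractive rather than abstractive.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it",
              "my", "on", "for", "i", "with", "that", "this"}

def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def extractive_summary(text, num_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Frequency of every non-stop word across the whole text.
    freq = Counter(w for w in tokenize(text) if w not in STOP_WORDS)

    def score(sentence):
        words = [w for w in tokenize(sentence) if w not in STOP_WORDS]
        # Use the average so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return [sentences[i] for i in keep]

# Invented example post, not taken from the LabVIEW forum.
post = (
    "My DAQ loop stops after a few minutes. "
    "The DAQ loop reads a DAQ channel inside a while loop. "
    "Thanks in advance for any help."
)
summary = extractive_summary(post, num_sentences=1)
print(summary)  # prints ['The DAQ loop reads a DAQ channel inside a while loop.']
```

An abstractive summarizer would instead generate new wording, which typically requires a trained language model and is beyond the scope of this sketch.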

To understand the current best practices and tool-feature needs of the LabVIEW community, we collected user posts from the LabVIEW online discussion forum. An online discussion forum is a website where individuals from different backgrounds can discuss common topics of interest in the form of posted messages. Online discussion forums are useful resources for sharing domain knowledge. They can be used for many purposes, such as sharing challenges and ideas, promoting the development of community, and giving and receiving support from peers and experts. Several researchers have identified benefits of online discussion forums in different areas, such as education (Jorczak, 2014), individual and societal development (Pendry & Salvatore, 2015), and socialization (Akcaoglu & Lee, 2016). The LabVIEW discussion forum is a very rich resource for text summarization because most of the user-generated content in the forum is text-based.

We applied text classification based on supervised machine learning techniques and topic modeling based on unsupervised machine learning techniques to the large collection of LabVIEW forum posts. After downloading all the posted questions through web scraping, we first used supervised machine learning to classify the questions into four categories (i.e., “program”, “hardware”, “tools and support”, and “others”). We compared three popular methods: Multinomial Naive Bayes, Support Vector Machine, and Random Forest.

After this, we applied unsupervised machine learning techniques to delve into the largest category (“program”) and find subtopics. In this chapter, we examine three unsupervised machine learning approaches: K-means clustering, hierarchical clustering, and Latent Dirichlet Allocation (LDA). We use the LabVIEW discussion forum as our case study and report empirical results.
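The supervised classification step described above can be illustrated with a small from-scratch sketch of Multinomial Naive Bayes, one of the three methods compared. This is an assumption-laden illustration rather than the chapter's implementation: the training posts are invented, the tokenizer is a simple regular expression, and add-one (Laplace) smoothing is an assumed choice. Only the four category names come from the text.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training posts; only the category names come from the chapter.
train = [
    ("how do I pass an array into a while loop", "program"),
    ("my subvi returns the wrong string from the case structure", "program"),
    ("the daq card is not detected on the usb port", "hardware"),
    ("serial port device times out when reading the sensor", "hardware"),
    ("installer fails and the license activation does not work", "tools and support"),
    ("where can I download the driver installer for the new version", "tools and support"),
    ("looking for a user group meeting or job posting", "others"),
    ("is there a certification exam schedule", "others"),
]
clf = MultinomialNB().fit([t for t, _ in train], [y for _, y in train])

print(clf.predict("while loop with an array of strings"))  # prints program
print(clf.predict("usb daq device not detected"))          # prints hardware
print(clf.predict("license activation fails"))             # prints tools and support
```

In practice, a library implementation together with a held-out test set would be the more usual way to compare Multinomial Naive Bayes against Support Vector Machine and Random Forest classifiers. The sketch only shows the counting and scoring that underlie the first of those methods.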

The contributions of this chapter are two-fold. First, we demonstrate how text summarization techniques can be used to extract key information from online discussion forums. Second, we describe future trends and research directions based on our analysis of the text summarization results, which point toward future areas of investigation for the text summarization research community.
