Speech for Content Creation

Joseph Polifroni, Imre Kiss, Stephanie Seneff
Copyright © 2011 | Pages: 15
DOI: 10.4018/jmhci.2011040103

Abstract

This paper proposes a paradigm for using speech to interact with computers, one that complements and extends traditional spoken dialogue systems: speech for content creation. The literature in automatic speech recognition (ASR), natural language processing (NLP), sentiment detection, and opinion mining is surveyed to argue that the time has come to use mobile devices to create content on the fly. Recent work in user modelling and recommender systems is examined to support the claim that using speech in this way can result in a useful interface to uniquely personalizable data. A data collection effort recently undertaken to help build a prototype system for spoken restaurant reviews is discussed. This vision depends critically on mobile technology, both for enabling the creation of the content and for providing ancillary data that makes its processing more relevant to individual users. Such a system can be useful even where only limited speech processing is possible.

Introduction

A couple are visiting Toronto and have just finished a meal at a small Chinese restaurant. The wife makes a habit of scouting out Chinese food in any city she visits and this restaurant was particularly good. As she walks out of the restaurant, she pulls out her mobile phone, clicks a button on the side, and speaks her thoughts about the meal she’s just eaten. She then puts her phone away, having recorded her impressions of the restaurant. Her location and the time of day have been recorded as part of the interaction. Our hypothetical user then hails a cab and goes off to the theater. Figure 1 shows what a user might say in this context.

Figure 1. A representation of how a user might create content via speech

The scenario described above is the first stage of interaction with an overall system that combines speech for content creation with social media and recommender systems. In subsequent sections, we will enlarge upon this scenario, with further glimpses into the user interaction and the underlying technology required at each step. We argue that these technologies are sufficiently advanced to enable the convenience of recording thoughts and impressions on the go, indexing the results, and extracting enough information to make the interaction useful to others.

One of the most important aspects of this scenario, and of those that follow, is that the user is in charge of the interaction the entire time. Users do not have to worry about getting involved in an interaction when they’re busy, in a noisy environment, or otherwise unable to devote time to the interface. Users can describe an experience while it is fresh in their memory through an interface that is always available to them. When they have the time and the inclination to make further use of the information, they can examine, review, and, ultimately, share it. The spoken input takes the form of a “note to self,” where the user does not have to plan carefully what to say (Figure 2).

Figure 2. A schematic representation of data capture and processing in the restaurant review scenario
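
To make the capture step concrete, the record created at speak time may amount to nothing more than audio plus ambient metadata. The Python sketch below is our own illustration, assuming a hypothetical VoiceNote structure and capture_note helper; neither name comes from the system described in this paper.

    from dataclasses import dataclass
    import time

    @dataclass
    class VoiceNote:
        """A hypothetical 'note to self': raw audio plus ambient metadata."""
        audio_path: str       # the recorded speech, stored until upload
        timestamp: float      # when the note was spoken (Unix time)
        latitude: float       # GPS fix at capture time
        longitude: float

    def capture_note(audio_path: str, lat: float, lon: float) -> VoiceNote:
        """Bundle a fresh recording with its time and place; no dialogue required."""
        return VoiceNote(audio_path, time.time(), lat, lon)

    # Example: the Toronto diner's review, queued for later cloud processing.
    note = capture_note("reviews/chinese_toronto.wav", 43.6532, -79.3832)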

In this initial scenario, the user’s interaction with the system stops after the review is spoken. Either immediately, or when connectivity is reestablished, the speech is uploaded to a cloud-based system. With a combination of automatic speech recognition (ASR) and natural language processing (NLP) technologies, the system goes to work indexing and deriving meaning from the dictated review. In the best case, information about individual features, such as food quality or service, is extracted and assigned a scalar value based on the user’s input. These values are used to populate a form, combined with other online sources of information (derived from the GPS coordinates associated with the speech at the time of data collection), and made available to the user to review, modify, and share. Various fallback levels of analysis are always available, so that the information is never completely lost or ineffectual. For example, the system may only be able to assign a single overall polarity to the entire review, or to extract some keywords for indexing. In the worst case, a simple audio file is saved and associated with a time-stamp and GPS location. The user remains unaware of this processing, which need not happen in real time. Further input comes later, at the discretion of the user. Figure 2 shows how this process might unfold.
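
The graceful degradation just described can be pictured as a chain of analyzers tried from deepest to shallowest. The sketch below is our own reconstruction of that logic, not the authors’ implementation: the analyzer functions are placeholders supplied by the caller, and the metadata dictionary stands in for the time-stamp and GPS data mentioned above.

    from typing import Callable, Optional

    Analyzer = Callable[[str], Optional[dict]]

    def analyze_review(transcript: Optional[str], metadata: dict,
                       analyzers: list[tuple[str, Analyzer]]) -> dict:
        """Apply the deepest analysis that succeeds; never lose the input."""
        result = dict(metadata)           # audio path, time-stamp, GPS
        if transcript is None:            # ASR produced nothing usable
            return result                 # worst case: audio plus metadata
        result["transcript"] = transcript
        for level, analyze in analyzers:  # e.g., features -> polarity -> keywords
            outcome = analyze(transcript)
            if outcome:                   # the first level that succeeds wins
                result[level] = outcome
                break
        return result

A caller would register the levels in order, from per-feature scoring down to keyword extraction; whichever succeeds first populates the form the user later reviews, and the raw audio survives every failure.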

Speech for content creation has several characteristics that make it attractive from a technological perspective:

  • It does not have to be real-time. As our scenarios illustrate, the user simply speaks to a mobile device to enter content. Any further interaction takes place at the convenience of the user.

  • It does not involve a detailed word-by-word analysis of the input. As we will show, further processing of the text can be done using just keywords/phrases in the user’s input (see the sketch after this list).

  • It can be designed with multiple fallback mechanisms, such that any step of the process can be perceived as useful and beneficial to the user.
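
To illustrate the second of these points, the toy sketch below derives per-feature scalar scores from phrase spotting alone, with no parsing. The cue lists, phrase lexicon, and 1-5 scale are invented for illustration; a deployed system would learn such associations from data.

    # Toy phrase lexicon mapping sentiment cues to scores on a 1-5 scale.
    PHRASE_SCORES = {"particularly good": 5, "delicious": 5,
                     "friendly": 4, "slow": 2, "terrible": 1}

    # Surface cues signalling which feature a review touches on.
    FEATURE_CUES = {"food": ("food", "meal", "dish"),
                    "service": ("service", "waiter", "staff")}

    def score_features(transcript: str) -> dict:
        """Give each mentioned feature the average score of spotted phrases."""
        text = transcript.lower()
        hits = [s for phrase, s in PHRASE_SCORES.items() if phrase in text]
        return {feature: round(sum(hits) / len(hits))
                for feature, cues in FEATURE_CUES.items()
                if hits and any(cue in text for cue in cues)}

    print(score_features("The meal was particularly good but the service was slow"))
    # -> {'food': 4, 'service': 4}: crude, since every spotted phrase counts
    # toward every mentioned feature; attributing each phrase to its nearest
    # cue is the obvious refinement.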
