Matching Word-Order Variations and Sorting Results for the iEPG Data Search

Denis Kiselev, Rafal Rzepka, Kenji Araki

Source Title: International Journal of Multimedia Data Engineering and Management (IJMDEM) 5(1)

DOI: 10.4018/ijmdem.2014010104

Abstract

This paper describes using a finite-state automaton (FSA) to retrieve Japanese TV guide text. The proposed FSA application can be considered novel due to the lack of research on the subject. The automaton has been implemented to match and extract all possible combinations of search query words, in all possible word orders, that may be present in the TV guide text. This implementation also sorts the extraction results by analyzing word semantic features (such as “being an object” or “being a property of an object”). The paper also proposes a search system using the above implementation and compares it with a baseline system that matches the words of multi-word queries in exactly the same and exactly the opposite word orders only. Both systems use morphological parsing and apply a stop list to the query. A multi-parameter evaluation has shown advantages of the proposed system over the baseline.

Motivation For This Research

Japanese is written without spaces between words. A search system processing this language therefore needs to “know” which character strings are words, or at least where character strings that could be words start and end. It is even better if the system attempts to find out what those words, or groups of characters, may mean. The same is true for searching the Japanese-language iEPG (Internet Electronic Program Guide or, simply, Web pages listing which programs are shown on TV and when).

It can be concluded from the output of search systems available on major Japanese iEPG websites¹ that those systems most likely apply the direct matching technique to the query, treated by them as a character string. In other words, they most likely match the search phrase without segmenting it into words (i.e. without morphological parsing and inserting spaces at word boundaries).

Kiselev et al. (2013) suggested improvements to that technique and proposed an iEPG search system that uses morphological parsing and core meaning analysis to match the search query with the TV guide text.

The above authors also demonstrated how using that system could improve search results; however, matching query words in all possible orders was left for future work.

The system proposed by the above authors can match query words (provided the query has two or more of them) in exactly the same or exactly the opposite order only (ibid.). For two-word queries, “exactly the same” and “exactly the opposite” cover all the possible word orders; for longer queries, however, there are more options. Thus, the system will successfully match text with “観光地は人気で綺麗 ([kankouchi wa ninki de kirei] the sightseeing spot is popular and beautiful)”² in response to the query “綺麗で人気な観光地 ([kirei de ninki na kankouchi] a beautiful and popular sightseeing spot)”, but will not match the same text in response to “人気で綺麗な観光地 ([ninki de kirei na kankouchi] a popular and beautiful sightseeing spot)”.
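The difference between the two matching strategies can be illustrated with a minimal Python sketch. It is not the finite-state automaton described in this paper, nor the baseline implementation of Kiselev et al. (2013); it simply tries every permutation of the query words. The function names are hypothetical, and the token lists are segmented by hand on the assumption that morphological parsing and a stop list have already removed the particles (は, で, な).

```python
from itertools import permutations


def occurs_in_order(words, text_tokens):
    """Return True if `words` appear in `text_tokens` in the given order.

    Other tokens may occur between the matched words.
    """
    remaining = iter(text_tokens)
    # `w in remaining` consumes the iterator up to and including the match,
    # so each following word must be found later in the text.
    return all(w in remaining for w in words)


def same_or_opposite_order_match(query_words, text_tokens):
    """Baseline-style check: exactly the same or exactly the opposite order."""
    return (occurs_in_order(query_words, text_tokens)
            or occurs_in_order(query_words[::-1], text_tokens))


def any_order_match(query_words, text_tokens):
    """Check every permutation of the query words (illustrative, not an FSA)."""
    return any(occurs_in_order(p, text_tokens)
               for p in permutations(query_words))


# Hand-segmented content words (assumed output of parsing plus stop list).
text = ["観光地", "人気", "綺麗"]        # 観光地は人気で綺麗
query_a = ["綺麗", "人気", "観光地"]     # 綺麗で人気な観光地
query_b = ["人気", "綺麗", "観光地"]     # 人気で綺麗な観光地

print(same_or_opposite_order_match(query_a, text))  # True (opposite order)
print(same_or_opposite_order_match(query_b, text))  # False
print(any_order_match(query_b, text))               # True
```

Enumerating permutations grows factorially with query length, so the sketch is only meant to make the word-order gap concrete for short queries.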

Maruoka (2011) describes this ability to express (practically) the same meaning using the same words in various orders as a characteristic feature of “context-free languages”, i.e. ones allowing more flexible word combinability. Tanabe, Tomiura and Hitaka (2000) illustrate the order flexibility in Japanese word combinations in terms of the “context-free grammar” and NLP (Natural Language Processing).

Implementing a system capable of matching query words in all the orders possible in the Japanese language has been the primary motivation for the research described in this paper. The system proposed by Kiselev et al. (2013) (mentioned earlier in this section) has been used as a baseline.

It should be noted that both the baseline and the proposed system implementations are essentially different from large search engines, such as Google. First, large search engines retrieve web documents, such as websites and parts of them, whereas the proposed and the baseline implementations retrieve pieces of text that describe TV programs. Second, to do so, the implementations do not require indexing millions of web documents (the way a Google webpage³ says it does) and do not need any corpora, such as the approximately 24 GB Google N-gram Corpus described by Lin et al. (2010). Because of their small size, the implementations could be used locally as, say, internal search systems for TV sets. It seems unlikely that, for instance, the Google search engine could be used in the same way. Our purpose has been to develop a search system for the TV program guide that takes these peculiarities of the task into account.

Input-Output Flow Of The Proposed System

This section contains a concise flow description. Sections that follow explain flow stages in more detail.
