Information Extraction for Call for Paper

Laurent Issertial (Osaka Prefecture University, Sakai, Japan) and Hiroshi Tsuji (Osaka Prefecture University, Sakai, Japan)
Copyright: © 2015 | Pages: 15
DOI: 10.4018/IJKSS.2015100103

Abstract

This paper proposes a system called CFP Manager, specialized for the IT field and designed to ease the search for conferences suited to one's needs. At present, the handling of CFPs faces two problems. For emails, the sheer quantity of CFPs received means that most are only skimmed through. For websites, a review of the main CFP aggregators available online reveals a lack of usable search criteria. The system addresses these problems with an architecture of three components. The first is an Information Extraction module that extracts relevant information (date, location, etc.) from CFPs using a rule-based text mining algorithm. The second enriches the extracted data with external data from ontology models. The third displays the data and lets the end user perform complex queries on the CFP dataset, so that only suitable CFPs are returned. To validate the proposal, the authors measure precision and recall for the information extraction component, obtaining averages of 0.95 for precision and 0.91 for recall on three different datasets of 100 CFPs each. Finally, the paper discusses the validity of the approach by comparing the system, on different queries, with two systems already available online (WikiCFP and IEEE Conference Search) and with a basic text search standing in for searching an email box. On a 100-CFP dataset, thanks to the wide variety of usable data and the ability to perform complex queries, the system outperforms basic text search and WikiCFP by not returning the false positives they usually return, and its results are close to those of the IEEE system.
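
As a reminder of what the precision and recall figures measure, the sketch below computes both for a set of extracted fields against a hand-labelled reference set. It assumes each extracted item is a (CFP id, field, value) triple and that an item counts as correct only on an exact match; the preview does not say how the authors count matches, so the function name and the data are purely illustrative.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for two sets of (cfp_id, field, value) triples."""
    true_positives = len(extracted & gold)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of reference items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Illustrative data only, not taken from the paper's datasets.
gold = {
    ("cfp1", "deadline", "2015-03-01"),
    ("cfp1", "location", "Sakai, Japan"),
    ("cfp2", "deadline", "2015-04-15"),
}
extracted = {
    ("cfp1", "deadline", "2015-03-01"),
    ("cfp1", "location", "Sakai, Japan"),
    ("cfp2", "deadline", "2015-05-15"),  # wrong value: one false positive, one miss
}

precision, recall = precision_recall(extracted, gold)
```

Here two of the three extracted items are correct and two of the three reference items are found, so both values are 2/3.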

Introduction

With today's technology, access to any type of information has greatly improved, and this is also true for our domain of study: Calls for Papers (CFP). It has never been so easy to find information on conferences, usually via numerous websites or simply by email. However, this profusion of information also has its downside: handling CFPs can be a very time-consuming task. Regarding email, many CFPs are received every day, and most of them are only skimmed through or forgotten for lack of time or attention. As for the web, many websites let you browse a wide range of CFPs, such as WikiCFP (2014) or ConferencePartner (2014), as do others bound to particular institutions, like IEEE (2014) or ACM (2012). Nevertheless, these sites have several weak points. They do not allow the user to use CFPs obtained from external sources (e.g. email); thus a new search has to be carried out on a totally different database, sometimes with less information than you could actually find in your CFP email. Moreover, conference search parameters can be very basic and unable to handle any request more complex than matching strings of characters, such as a request for a conference about topic X in country Y with deadline Z. Those are the issues we propose to handle with our system: CFP Manager.
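
The difference between string matching and a structured request of the "topic X in country Y with deadline Z" kind can be sketched as follows. This is a minimal illustration, not the CFP Manager query system: the `Cfp` record, its field names, and both search functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Cfp:
    """Hypothetical structured record for one CFP."""
    name: str
    raw_text: str
    topics: frozenset
    country: str
    deadline: date


def text_search(cfps, keyword):
    """Plain string matching, as when searching an email box."""
    return [c.name for c in cfps if keyword.lower() in c.raw_text.lower()]


def structured_search(cfps, topic, country, deadline_on_or_before):
    """Combine criteria on typed fields instead of raw strings."""
    return [
        c.name
        for c in cfps
        if topic in c.topics
        and c.country == country
        and c.deadline <= deadline_on_or_before
    ]


# Illustrative data only.
cfps = [
    Cfp("Conf A", "Workshop on text mining, Osaka, Japan",
        frozenset({"text mining"}), "Japan", date(2015, 3, 1)),
    Cfp("Conf B", "Databases track. Past keynote speaker from Japan.",
        frozenset({"databases"}), "France", date(2015, 2, 1)),
]

by_text = text_search(cfps, "Japan")
by_fields = structured_search(cfps, "text mining", "Japan", date(2015, 6, 30))
```

The text search returns both conferences because the word "Japan" appears in the second CFP even though that conference is held elsewhere, which is the kind of false positive a field-based query avoids.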

In one of our previous works (Issertial & Tsuji, 2011), we proposed the concept of a text mining system able to extract relevant information from a group of CFPs. In another, concerning the visualized comparison of CFP datasets (Issertial, Saga, & Tsuji, 2012), we introduced the idea of enriching extracted data by means of ontologies. A third (Issertial & Tsuji, 2013) focused on the query and interface system, along with the different kinds of output that can be obtained from the set of relevant data previously collected. This paper focuses on the case of an end user who is looking for information about conferences suited to them, and on how our proposed system eases this process. We do so by reviewing current web services that offer this functionality and comparing their theoretical results with those of our proposed system on different queries. This is presented along with a description of our system and numerical experiments demonstrating that it works as intended. CFPs are processed by rule-based text mining algorithms in order to extract relevant information. These data are then enriched via ontology models with concepts close to the ones found in the CFP. Finally, the collected data form the basis of a user interface with an intuitive query system that allows the user to perform complex queries.
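
The extraction and enrichment steps described above can be pictured with a small sketch. It assumes CFPs that state their deadline and location on labelled lines, and it replaces the ontology models with a toy dictionary of broader concepts; the rules, the `BROADER` mapping, and the function names are all hypothetical, since the preview does not show the authors' actual rule set or ontology.

```python
import re
from datetime import datetime

# Hypothetical extraction rules for labelled lines.
DEADLINE_RULE = re.compile(
    r"(?:submission\s+)?deadline\s*:\s*([A-Za-z]+ \d{1,2}, \d{4})", re.IGNORECASE
)
LOCATION_RULE = re.compile(
    r"(?:location|venue)\s*:\s*([^,\n]+),\s*([^\n]+)", re.IGNORECASE
)

# Toy stand-in for an ontology: topic -> broader concepts (illustrative only).
BROADER = {
    "text mining": ["data mining"],
    "ontology": ["knowledge representation"],
}


def extract(cfp_text):
    """Apply the rules to one CFP and return a structured record."""
    record = {"deadline": None, "city": None, "country": None, "topics": []}
    match = DEADLINE_RULE.search(cfp_text)
    if match:
        # Assumes English month names such as "March 1, 2015".
        record["deadline"] = datetime.strptime(match.group(1), "%B %d, %Y").date()
    match = LOCATION_RULE.search(cfp_text)
    if match:
        record["city"] = match.group(1).strip()
        record["country"] = match.group(2).strip()
    lowered = cfp_text.lower()
    record["topics"] = sorted(t for t in BROADER if t in lowered)
    return record


def enrich(record):
    """Add concepts related to the extracted topics."""
    related = {concept for t in record["topics"] for concept in BROADER[t]}
    return {**record, "related_concepts": sorted(related)}


sample = """Call for Papers: Example Workshop on Text Mining
Location: Sakai, Japan
Submission deadline: March 1, 2015
"""

enriched = enrich(extract(sample))
```

The enriched record carries typed fields (a date, a city, a country) plus related concepts, which is what makes field-based queries possible downstream. Real CFPs are far less regular than this sample, so a practical rule set would need many more patterns and date formats.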
