Service Class Driven Dynamic Data Source Discovery with DynaBot

Daniel Rocco, James Caverlee, Ling Liu, Terence Critchlow
Copyright: © 2007 | Pages: 23
DOI: 10.4018/jwsr.2007070102

Abstract

Dynamic Web data sources on the Deep Web provide intuitive access to real-time information and large data repositories anywhere that Web access is available. Although recent studies suggest that the dynamic Web is larger and growing faster than the static Web, dynamic content is often ignored by existing search engine indexers owing to the technical challenges inherent in searching dynamic sources. To address these challenges, we present DynaBot, a service-centric crawler for discovering and clustering Deep Web sources. DynaBot has three unique characteristics. First, DynaBot utilizes a service class model implemented through the construction of service class descriptions (SCDs). Second, DynaBot employs a modular architecture for focused crawling of the Deep Web. Third, DynaBot incorporates algorithms for efficiently probing, discovering, and clustering Deep Web sources through SCD-based service analysis. Experimental results demonstrate DynaBot’s effectiveness and suggest techniques for efficiently managing service discovery given the immense scale of the Deep Web.
