Availability and accuracy of distributed web crawlers: A model-based evaluation

Mitra Nasri, Saeed Shariati, Mohsen Sharifi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

3 Citations (Scopus)

Abstract

Distributed Web crawlers are used extensively for Web mining, but their accuracy, dependability and other operational measures have not been fully studied. They are costly and require careful selection of configuration parameters, so it is important to be able to estimate the performance, dependability and accuracy of a Web crawler. This paper presents a model-based evaluation of the accuracy and availability of a distributed Web crawler whose architecture is based on UbiCrawler. Stochastic activity networks are used to model the crawler. Accuracy and availability of the Web crawler are formally defined, and the effects of environmental failure rates on crawling nodes and on the availability of the whole system are discussed.

Original language: English
Title of host publication: Proceedings - EMS 2008, European Modelling Symposium, 2nd UKSim European Symposium on Computer Modelling and Simulation
Publisher: Institute of Electrical and Electronics Engineers
Pages: 453-458
Number of pages: 6
ISBN (Print): 9780769533254
DOIs
Publication status: Published - 2008
Externally published: Yes
Event: EMS 2008, European Modelling Symposium, 2nd UKSim European Symposium on Computer Modelling and Simulation - Liverpool, United Kingdom
Duration: 8 Sept 2008 to 10 Sept 2008

Conference

Conference: EMS 2008, European Modelling Symposium, 2nd UKSim European Symposium on Computer Modelling and Simulation
Country/Territory: United Kingdom
City: Liverpool
Period: 8/09/08 to 10/09/08
